SECTION 1. INTRODUCTION

The cost of developing large-scale computer systems has increased
dramatically in the last few years. In spite of more sophisticated
design techniques, many systems fail to meet cost, schedule,
cost-benefit, and performance objectives; many systems, once completed,
do not perform as well as they are expected to; many systems never even
get completed.

Current software system development methodologies emphasize a
process in which development is conceived as proceeding through a series
of phases. Each phase is organized to complete a specific planned
process and produces output, in the form of information or design
documents, that is input to the next phase. Most attempts to improve the
efficiency of the development cycle have concentrated on improving the
processes that comprise some single phase. Structured programming
focuses on the programming stage of the development phase, while composite
design applies to the design stage of the development phase.

There  is  a  need,  however,  for  design  validation  at  less  than 
full-system  cost,  and  for  prototyping  design  alternatives.  The  use  of 
integrated  scaled  systems  presents  such  a  technique. 

Scaled  systems  are  operational  systems  implementing  subsets  of 
capabilities  and/or  performance  characteristics  of  the  ultimate 
full-scale  system.  The  scaled  system  approach  is  intended  to  bridge  the 
gap  between  the  definition  and  design  stages  of  the  development  phase. 

Using scaled system concepts for the design, development, and
evaluation of intelligence data handling computer systems is expected to
improve the way these tasks are performed. It is anticipated that the
initial expenditure on a "scaled system", which implements a subset of
the capabilities of the full-scale system at a fraction of its cost, will
decrease the overall full-scale system cost, schedule, and risk. Because
scaled systems are operational systems, users can immediately obtain the
benefits available from partial automation of their requirements.

The use of scaled systems within a development effort can have
several benefits. However, only some of the benefits may be applicable
to any specific development effort. Which of the benefits are desirable
will determine the objectives for using scaled systems within the
development effort. Once these objectives are established, the precise
manner in which the scaled system should be defined from the full-scale
one can be determined. Knowledge of the benefits realizable from the
application of scaled systems is therefore vital to understanding the
scaled system technique, so potential benefits are listed below.

a. Users can begin using a scaled system as soon as it is
implemented, since scaled systems are operational systems. Feedback from
users can guide final design decisions for the full-scale system. This
benefit is particularly important in instances where users are unable to
clearly specify their requirements for automated support due to their
lack of experience with computers or to the unique nature of the tasks
they desire to automate. The scaled system can be used to demonstrate
exactly what capabilities are available to the user as well as give the
user an idea of how he will interface with the system and what procedures
must be developed. Based on his experience with a scaled system, the
user will then be able to clearly specify his requirements for the
full-scale system, thereby greatly increasing the probability of success
for the overall development effort.

b.  Different  techniques  for  performing  unique  or  state-of-the-art 
operations  can  be  tried  with  scaled  systems,  in  order  to  establish 
feasibility  of  complex  designs  or  to  determine  the  optimal  way  to  provide 
certain  capabilities  within  different  environments. 

c. The team developing a scaled system obtains valuable experience
with the project that increases its productivity when developing the
full-scale system. Design lessons learned from the scaled system also
decrease the number of false starts and blind alleys encountered during
full-scale development.

d.  In  many  instances  a  scaled  system  can  be  incrementally  expanded 
to  eventually  implement  the  desired  full-scale  system.  The  incremental 
development  approach  is  usually  more  cost-effective  than  is  an  attempt  to 
implement  an  entire  large-scale  system  at  once  in  a  turnkey  fashion. 

e.  The  cost  and  schedule  for  scaled  system  development,  once  that 
development  is  complete,  can  be  used  as  a  predictor  for  the  cost  and 
schedule of full-scale system development. This effort has examined how
reliable predictors can be established.

f. The performance of a scaled system can be used as a predictor
for the performance of the corresponding full-scale system. Full-scale
systems often fail to meet their performance objectives, and the use of a
scaled system may indicate that a redesign, increase of system resources,
and/or relaxation of performance objectives is required to achieve the
desired full-scale system performance. In cases where the scaled system
indicates that the desired performance is achievable, the performance
predicted by the scaled system can be used in the evaluation of the
full-scale system ultimately implemented, thereby reducing the risk of
implementing systems with inadequate performance. This effort has
researched the development of reliable performance predictors for scaled
systems.

A scaled system is implemented during the definition or design stages
in the life cycle of full-scale system development. The scaled system may
be developed based on the functional description for the full-scale
system, or, in certain instances, based on the system specification. It
is desirable to implement the scaled system as early in the development
cycle as possible, as experience gained with the scaled system can
provide valuable insight for later full-scale system design. Thus, the
preferred approach is that the scaled system be implemented based on the
full-scale functional description, and that the full-scale system
specification be developed based on the scaled system. It should be
noted that the scaled system has its own development cycle similar to
that of the full-scale system, except with much shorter schedules.

The scaled system originally implemented as a design tool can then
be used again during the evaluation phase of the full-scale system
development cycle. This research has investigated techniques for
predicting full-scale system performance based on scaled system
performance. Thus, measurements made on the scaled system can originally
be used to predict full-scale system performance, and can later be used
to evaluate how well the implemented full-scale system achieved those
performance predictions.

Section 2 discusses the research methodology, including the
objectives of the effort and how they were achieved, in particular in
terms of the simulator and cost model developed in this effort. Section
3 describes the specific results of the research, including the
definitions of scale factor metrics, system parameter interrelationships,
guidelines on scaling system scale factors, decision factors and
guidelines indicating when to use scaled systems as part of a design
effort, and anticipated cost benefits of employing scaling techniques.
Section 4 discusses research efforts that will be fruitful areas for
further investigation.
SECTION 2. METHODOLOGY


The objective of this effort was to conduct research to define the
applications of scaled systems as design instruments for designing,
developing, and evaluating intelligence systems, in order to provide a
concrete means of investigating and ascertaining the various factors
that are pertinent to the application of scaled systems. Various
elements of software systems, "System Scale Factors," were evaluated with
the specific objective of identifying the elements most suitable to small
scaled applications. These items were then quantified to provide a
uniform and standardized terminology allowing objective categorization of
scaled systems, based on the corresponding full-scale system. In order to
determine the interrelationships among these scale factors, so that they
may be considered in the overall system methodology for using scaled
systems, a concept was evolved that uses a simulation model of a
generalized IDHS to predict performance and to predict changes in one
scale factor variable from changes in another.

Scaled systems techniques were developed to provide better estimates
of total development cost, schedule, and performance, by defining
decision factors for using scaled systems, in order to indicate when
scaled systems should be used as part of a design effort. The decision
factors are to provide justification, in terms of ultimate full-scale
system cost, schedule, risk, and performance, for using a scaled system.
A preliminary integrated cost model, synthesizing the best
characteristics of the models studied into a single model suitable for
scaled systems research, was implemented and calibrated with data derived
from analysis of an actual intelligence system, the Defense Intelligence
Agency (DIA) Integrated Indications System (DIIS), in order to place the
decision factor guidelines on a firm quantitative footing.

The objective was then to identify specific benefits realizable from
the scaled systems approach by analyzing past systems developed and
comparing an actual scaled system to its full-scale counterpart, namely
the NMIC system and INCO's scaled version of the NMIC's User Support
Subsystem (USS), called the Indications and Warning Training System
(IWTS). These two systems (NMIC and IWTS) were compared and contrasted
in terms of their relative size, cost, hardware configuration, software
implementation, complexity, difficulty, and effort expended to complete
them, as far as the data permitted such analysis.

Section  2.1  describes  the  design  of  the  overall  effort.  Section  2.2 
describes  the  operating  system  performance  simulator  and  Section  2.3  the 
cost  models  designed  for  evaluation  of  actual  and  proposed  scaled 
systems. 

2.1  General 

In order to define the scaled system methodology, two similar types
of relationships were considered in this effort: (1) how the
performance of a scaled system compares to that of a full-scale system
and (2) how the use of a scaled system affects the total cost, schedule,
and risk of a system development effort. The first type of relationship
is required to predict the performance of a full-scale system based on
that of a scaled system, while the second type of relationship is
required to judge the benefit of using a scaled system as part of a
development effort. Both types of relationship, taken together, are also
required to determine precisely which system parameters should be scaled,
and by what amount, to take maximal advantage of a scaled system within a
given development effort.

The way in which the elements of the technical approach combine to
satisfy the total research objectives can be summarized as follows:

o Identify system parameters that are suitable for
scaling.

o  Define  scale  factors  for  each  of  these  parameters. 

o  Examine  the  correlations  and  interrelationships  among 
scale  factors. 

o  Use  these  correlations  for  developing  guidelines  of 
which  parameters  to  scale  and  how  much,  based  on 
system  objectives. 

o  Prepare  a  list  of  decision  factors  that  are 
indicative  of  whether  or  not  scaled  systems  should  be 
used  as  part  of  a  development  effort. 

o  Develop  guidelines  for  whether  or  not  scaled  systems 
should  be  used  based  on  these  decision  factors. 

o Identify specific benefits realizable from the scaled
systems  approach  for  past  systems  developed  and 
future  systems  to  be  developed. 

o  Quantify  benefits  for  planned  systems  realizable 
through  the  use  of  scaled  systems. 

2.2 The INCO System Performance Simulator

The INCO system performance simulator (ISPS) is an event-driven
simulator designed to execute on INCO's interactive microprocessor-based
computer systems. The simulator models a generalized, variable computer
system configuration consisting of a CPU, a disk, a user-specified number
of on-line terminals, and the associated system queues necessary to
simulate the allocation of these resources. A detailed abstract
technical discussion of the simulator can be found in Appendix F, and
discussions concerning its operation and method of application to this
research can be found in the earlier portions of this section. The
objective of this discussion is to highlight the simulator's functional
characteristics.

The simulator was designed in a programmer's design language (PDL)
and subsequently coded in FORTRAN. Its purpose, as previously stated,
was to model a variable computer system environment. This variable
environment is specified by the simulator's user by way of a description
of the characteristics of the system configuration's components. These
input parameters are specified by the user at run-time through an
interactive query. The input parameter set and its format are illustrated
in Figure 2-01. This is the same query the user iterates through before
simulator execution.

During the simulation, the user may optionally observe the steps the
simulator makes through a video display that is updated by the simulator
at the occurrence of each new simulator event. The execution speed of
the simulator is increased, however, if the user selects the "truncated"
terminal display format as opposed to this "extended" format, which
requires the additional processing overhead of the terminal I/O in order
to periodically update the display. The screen display is illustrated in
Figure 2-02. A sample simulator performance output is shown in Figure
2-03.
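
The summary figures in the sample output of Figure 2-03 can be cross-checked against its per-terminal rows. The short Python sketch below does so; the function names are illustrative only and are not part of the simulator, and the assumption that the reported "Average Responsiveness" is a job-weighted mean of the per-terminal averages is an inference from the numbers, not something the report states.

```python
# Per-terminal rows transcribed from the sample run in Figure 2-03:
# (jobs completed, average response time in simulator time units).
PER_TERMINAL = [
    (6, 3571.81), (7, 3292.74), (4, 5699.30), (5, 4626.85), (8, 2878.71),
    (7, 3086.97), (6, 3879.37), (5, 4509.58), (6, 3505.37), (7, 2968.22),
]


def total_jobs(rows):
    """Sum the completed-job counts over all terminals."""
    return sum(jobs for jobs, _ in rows)


def job_weighted_mean_response(rows):
    """Average response over all jobs, weighting each terminal by its job count."""
    return sum(jobs * response for jobs, response in rows) / total_jobs(rows)


if __name__ == "__main__":
    print("jobs completed:", total_jobs(PER_TERMINAL))
    print("mean response :", round(job_weighted_mean_response(PER_TERMINAL), 2))
```

Under that assumption the job count matches the 61 jobs reported, and the weighted mean agrees with the reported 3650.56 to within the rounding of the per-terminal averages.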

Using  this  simulator,  an  analyst  can  explore  the  rudimentary 
performance  characteristics  of  varying  computer  system  hardware 
configurations  as  well  as  the  effects  of  generalized  job-type  mixes.  Job 
types  are  classified  as  either  CPU-  or  disk-bound  for  purposes  of  the 
simulation.  For  example,  the  simulator  can  help  the  analyst  determine 
SPSIM INPUT PARAMETERS

(a) Number of terminals?
(b) % Percentage mix between CPU- & Disk-bound jobs?
(c) Mean CPU service time (CPU bound jobs)?
(d) Mean CPU service time (Disk bound jobs)?
(e) Mean Disk service time (CPU bound jobs)?
(f) Mean Disk service time (Disk bound jobs)?
(g) Mean CPU/Disk iteration count (CPU bound jobs)?
(h) CPU/Disk iteration count std. dev. (CPU bound jobs)?
(i) Mean CPU/Disk iteration count (Disk bound jobs)?
(j) CPU/Disk iteration count std. dev. (Disk bound jobs)?
(k) Mean wait time for terminal # <1-100>?
(l) Std. dev. about wait time for terminal # <1-100>?
(m) Extended (F) or Truncated (T) Screen Display?


Figure 2-01. Simulator Input Parameter Set
Figure 2-02. Simulator Screen Display


INCO SYSTEM PERFORMANCE SIMULATOR RESULTS

Comment: RUN #69
Time and date of run: 11:36-APRIL 27, 1981

INPUT PARAMETER SUMMARY

                                  MEAN VALUE    STAND. DEV.
Service times:                        1.25          0.00
  CPU:
    CPU - Bound jobs                  1.25          0.00
    Disk - Bound jobs                 1.40          0.00
  Disk:
    CPU - Bound jobs                 35.00          0.00
    Disk - Bound jobs                40.00          0.00
Iteration counts:
    CPU - Bound jobs                 10.00          1.00
    Disk - Bound jobs                30.00          3.00
Terminal Delay Times:
    Terminal # 1                     50.00          0.00
    Terminal # 2                     50.00          0.00
    Terminal # 3                     50.00          0.00
    Terminal # 4                     50.00          0.00
    Terminal # 5                     50.00          0.00
    Terminal # 6                     50.00          0.00
    Terminal # 7                     50.00          0.00
    Terminal # 8                     50.00          0.00
    Terminal # 9                     50.00          0.00
    Terminal #10                     50.00          0.00

Number of On-line terminals: 10.
Job Mix (ratio of CPU/Disk bound jobs): 50.00


Figure 2-03. Sample Simulator Performance Output


INCO SYSTEM PERFORMANCE SIMULATOR RESULTS

Comment: RUN #61
Time and date of run: 11:36-APRIL 27, 1981

SIMULATION RESULTS

Terminal Responsiveness:

Terminal #    Jobs Queued/Completed    Average Response
     1               6/ 6                  3571.81
     2               7/ 7                  3292.74
     3               4/ 4                  5699.30
     4               5/ 5                  4626.85
     5               8/ 8                  2878.71
     6               7/ 7                  3086.97
     7               6/ 6                  3879.37
     8               5/ 5                  4509.58
     9               6/ 6                  3505.37
    10               7/ 7                  2968.22

System Performance Summary:
    Number of Terminals         =  10
    Number of Jobs Submitted    =  61
    Number of Jobs Completed    =  61
    Elapsed Time                =  23276.
    Average Responsiveness      =  3650.56
    Hardware Utilization:
        CPU  -                  =  3.48%
        Disk -                  =  99.93%

Queue Summary    Average # in Queue    Average Wait Time
    CPU  -              .00                  1.08
    Disk -             8.41                322.90


Figure 2-03. (Continued)
the relative impacts of such system configuration changes as the addition
of on-line terminals, faster or slower terminals, faster or slower disks,
or a CPU with different speed characteristics. Internally, the simulator
considers only a single CPU and a single disk; this does not present a
major problem, however, as multiple devices can be accounted for by
assumptions concerning their service time efficiencies. For example,
adding disk drives and/or controllers can be reflected through a decrease
in the disk service time parameter of the input mix, which has the effect
of speeding up the simulation of disk I/O. Additionally, many general
system performance characteristics can be observed or validated through
the use of this simulator. For example, use of the simulator has
supported the hypothesis that the responsiveness of computer
configurations is limited by the slowest memory present in the
configuration, namely the auxiliary disk storage. Consequently, the disk
resources are heavily utilized in terms of their available time.
Simulations consistently showed that the disk resources were 90-100%
utilized, whereas the CPU was only 3-25% utilized. Through the use of
this simulator, the interrelationships of scaled system configuration
items could be examined.
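
The kind of model described above can be illustrated with a minimal event-driven sketch in Python. This is not the ISPS itself, which was coded in FORTRAN and is documented in Appendix F. The function name `simulate`, its parameter names, the FIFO queue discipline, the constant service and think times, and the normally distributed iteration counts are all assumptions made for illustration; the default values echo the input summary of Figure 2-03, but the sketch is not expected to reproduce that figure's results.

```python
import heapq
import random
from collections import deque


def simulate(n_terminals=10, cpu_bound_fraction=0.5, think_time=50.0,
             cpu_time=(1.25, 1.40), disk_time=(35.0, 40.0),
             iterations=((10.0, 1.0), (30.0, 3.0)),
             end_time=25000.0, seed=1):
    """Event-driven sketch of terminals sharing one CPU and one disk.

    In each pair, index 0 applies to CPU-bound jobs and index 1 to
    disk-bound jobs. Service and think times are held constant here.
    """
    rng = random.Random(seed)
    service = {"cpu": cpu_time, "disk": disk_time}
    queue = {"cpu": deque(), "disk": deque()}   # head of each queue is in service
    busy_since = {"cpu": None, "disk": None}
    busy_total = {"cpu": 0.0, "disk": 0.0}
    events, seq, responses = [], 0, []

    def schedule(time, kind, job):
        nonlocal seq
        heapq.heappush(events, (time, seq, kind, job))  # seq breaks ties
        seq += 1

    def start_if_idle(resource, now):
        if busy_since[resource] is None and queue[resource]:
            job = queue[resource][0]
            busy_since[resource] = now
            schedule(now + service[resource][job["type"]], resource + "_done", job)

    for _ in range(n_terminals):
        schedule(think_time, "submit", None)

    clock = 0.0
    while events and events[0][0] <= end_time:
        clock, _, kind, job = heapq.heappop(events)
        if kind == "submit":
            jtype = 0 if rng.random() < cpu_bound_fraction else 1
            mean, sd = iterations[jtype]
            count = max(1, round(rng.gauss(mean, sd)))
            queue["cpu"].append({"type": jtype, "left": count, "start": clock})
        else:
            resource = kind[:-5]                 # "cpu_done" -> "cpu"
            queue[resource].popleft()
            busy_total[resource] += clock - busy_since[resource]
            busy_since[resource] = None
            if resource == "cpu":
                queue["disk"].append(job)        # each CPU burst is followed by disk I/O
            else:
                job["left"] -= 1
                if job["left"] > 0:
                    queue["cpu"].append(job)     # next CPU/disk iteration
                else:
                    responses.append(clock - job["start"])
                    schedule(clock + think_time, "submit", None)
        start_if_idle("cpu", clock)
        start_if_idle("disk", clock)

    for resource, since in busy_since.items():   # count service still in progress
        if since is not None:
            busy_total[resource] += clock - since
    return {
        "completed": len(responses),
        "mean_response": sum(responses) / len(responses) if responses else float("nan"),
        "cpu_util": busy_total["cpu"] / clock if clock else 0.0,
        "disk_util": busy_total["disk"] / clock if clock else 0.0,
    }


if __name__ == "__main__":
    print(simulate())
    print(simulate(disk_time=(17.5, 20.0)))  # e.g. a second drive, modeled as halved service time
```

The second call follows the report's suggestion that additional disk drives can be approximated by a smaller disk service time. With the default inputs the disk is the saturated resource, so it is the disk parameter, not the CPU parameter, that governs throughput in this sketch.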

2.3 The INCO Cost Estimation Model

The INCO life cycle cost model is the result of extensive research
performed in the areas of software engineering, life cycle software cost
estimating, and scaled system development by INCO, INC. The model
reflects INCO's commitment to develop a low-cost software life cycle
cost model for in-house use on low-cost microprocessor hardware.
2.3.1 Genesis


INCO began research and development of its own software life cycle
costing model in May of 1979. The first step of this effort included
training sessions with the PRICE S and SLIM software cost estimating
models and the start of what would become an intensive literature search
and study. In this phase, INCO personnel absorbed as much as was
possible from available information on the subjects of software cost
estimation, commercial software cost models, and software life cycle cost
behavior and management. Published research that was found to be of
most value is summarized in Figure 2-04. A comparison of open-literature
models was performed, and an example is included in Figure 2-05.

Around that point in time, some of INCO's other contracted-for
research efforts realized the need for some sort of cost estimation tool,
however rudimentary. One such effort was the Scaled Systems Project.

Under the Scaled Systems effort, INCO was providing research support
to the Rome Air Development Center (RADC) in the way of exploring cost
effective software development methodologies, particularly in the areas
of prototype and scaled/prototype developmental systems. As part of this
effort, critical cost relationships between scaled systems and their
full-scale counterparts were examined. Of specific interest were the
potential benefits which could be derived from the experience an
organization would gain from the implementation of a scaled operational
version of a state-of-the-art system before actual development commenced
on the full-scale system. Of additional interest was the sensitivity of
the forecasted cost benefits to overall scale factor. This was the first
application of INCO's cost model. To explore the productivity and cost
STEP 1


COST  MODEL  DEVELOPMENT 

Survey  Of  Published  Research 

-  Doty  &  Associates 

-  IBM's  Walston  &  Felix 

-  IEEE's  Tutorials  on  Software  Costing 

-  Maurice  Halstead's  "Software  Science" 

-  University  of  Maryland's  Comp.  Sci.  Dept.  (Vic  Basili) 

-  CACS's  Survey  of  Software  Cost  Estimating  Models 

-  Lawrence  Putnam  (SLIM) 

-  ISPA's  Newsletter  and  Proceedings 

-  RCA's  Price-S 


Figure  2-04.  Significant  Cost  Model  Literature 
Figure 2-05. Comparison of Open Literature Models

Effort Results

impacts of such factors as realizing personnel experience and firmness of
operational requirements, the Doty model was exercised over a range of
system sizes in the context of developing full-scale systems from
built-to-scale systems. A sample of the model's interactive display used
for such analysis is provided in Figure 2-06. This figure reveals the
cost factors accounted for by the Doty model. The generalized result
from the scale factor sensitivity analysis is portrayed in Figure 2-07.

Such exercise proved invaluable to the development of the cost
model. After an initial survey and exercise of current cost modeling
methodologies, INCO adopted the approach of synthesizing the best
characteristics of each model it had scrutinized into one model. The
theoretical basis, however, remained close to the properties outlined by
Lawrence Putnam in his many research works. The remaining steps of
model development are summarized in Figure 2-08.

2.3.2 Foundation.

The basic Putnam model (Figure 2-09) was attractive for a number of
reasons. First, it is the best of the "publicized" models - its internal
characteristics are defined, outlined, and validated in print. The
internals of a model such as PRICE S, in contrast, are very closely held
by its inventors and vendor, RCA. Second, the Putnam model has the best
facilities for adaptability and changeability through its technological
constant and software equation. Third, the Putnam methodology seems to
be the best accepted, on a theoretical basis, and many other researchers
are actively exploring its properties, behavior, and possibilities.
Fourth, the possibilities the Putnam model holds as a tracking/management
tool looked promising. This was especially important for an ancillary
From the Doty & Associates (RADC) Studies:

Please Select an Application Category:

1 - Utility (OS)
2 - Command & Control (C2)
3 - Scientific
4 - Business
5 - All (Others not listed above)

Selection (1-5)?
Estimated Deliverable Source LOC (1,000's)?
(S)cale, (U)pscale, or (O)ption? O

Please input a yes/no (Y/N) response to each of these 14 questions:

Special display?
Detailed definition of operational req'mts?
Change to operational req'mts?
Real time operation?
CPU memory constraint?
CPU time constraint?
First S/W developed on CPU?
Concurrent development of ADP H/W?
Time share, vis-a-vis batch processing, in dev'ment?
Off-site development computer facilities?
On-site development computer facilities?
Development computer different than target computer?
Multi-site development computer facilities?
Unlimited programmer access to computer facilities?

9999.99 Man Months req'd for analysis, design, code, debug, test and checkout.
( Standard error on this approximation = 99.9 % )

Estimated schedule duration = 999.99 Months

Continue (Y or N)?


Figure 2-06. Example of INCO Model's Interactive Display
COST/BENEFIT ANALYSIS (TYPICAL)


Figure 2-07. Scale Factor Sensitivity Analysis


STEP 1:

- Program Generalized Cost Formulas in BASIC
- Exercise & Compare Results
- Tech. Memo: "Scaled Systems Cost Effectiveness"

STEP 2:

- Putnam Methodology Selected As Most Suitable
  For Our Purposes
- Began Detailed Implementation & Development

STEP 3:

- Began Calibration of Model to Other Models and
  Past Experience


Figure 2-08. INCO Cost Model Development
effort taking place at INCO that consisted of the design and development
of an integrated set of individual models addressing the entire scope of
software development. This effort is highlighted by the automated
implementation of INCO's tried and proven requirements Structured
Organization and Analysis Procedure (SOAP), namely, the Requirements
Analysis and Tracking System (RATS).

As mentioned, the power of the Putnam-based model is augmented by
other models, most notably those of Doty [ref. 7], Walston and Felix
[ref. 24], and Halstead, as set out in his book Software Science.

The Doty model of cost estimation is programmed into the INCO cost
model and is available through the option menu for use by the costing
analyst. Experience with the Doty equations has produced very favorable
results by way of convergence in calibration attempts with known cost
data and the estimates of other cost models, namely the PRICE S cost
estimation model. Subsequently, the Doty model was the primary choice
for estimation purposes under the scaled systems research effort.
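
Several of the open-literature models surveyed above are power-law fits of effort against program size. The Python sketch below shows that general form only. The default coefficients are the widely quoted Walston and Felix fit (effort in man-months as 5.2 times thousands of lines raised to the power 0.91); they are used here purely as an example, the Doty coefficients and its 14 adjustment factors are not reproduced, and the function names are hypothetical.

```python
def effort_man_months(kloc, a=5.2, b=0.91):
    """Nominal effort from a power-law fit: MM = a * KLOC ** b.

    The defaults are the commonly cited Walston-Felix values, used only
    as an illustration; any calibrated model would supply its own a and b.
    """
    if kloc <= 0:
        raise ValueError("program size must be positive")
    return a * kloc ** b


def scaled_effort_fraction(scale_factor, b=0.91):
    """Effort of a system scaled to `scale_factor` of full size, as a
    fraction of full-scale effort, under the same power law."""
    if not 0 < scale_factor <= 1:
        raise ValueError("scale factor must lie in (0, 1]")
    return scale_factor ** b


if __name__ == "__main__":
    full = effort_man_months(100.0)       # hypothetical 100 KLOC full-scale system
    scaled = effort_man_months(25.0)      # hypothetical scaled system at 25% of that size
    print(round(full, 1), round(scaled, 1), round(scaled / full, 3))
```

Because the ratio depends only on the exponent, a sketch of this kind gives a quick feel for how sensitive a scaled-system cost estimate is to the overall scale factor before a calibrated model is applied.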

2.3.3 Current Stage of Development.

The current capabilities of the model are illustrated in Figure
2-10. As in "Step 3" of Figure 2-08, the model is still in the
calibration and enhancement stages. This is perceived as an on-going
phase, since a software cost model is never really "done". The INCO model
was designed with an eye for evolution and adaptability as more becomes
known about the science of software cost estimating and as cost estimates
can be traced through to their respective actual costs. Specifically,
INCO is exploring credible, verifiable methods by which the technology
constant can be more accurately determined. This has evolved to a set of
Capabilities of INCO Cost Model


adjustments based upon environmental, product, technological, and
organizational factors. These adjusting factors have been aided by
research such as that performed under the Scaled System project already
mentioned. A brief enumeration of these factors is shown in Figure 2-11.

Keeping  abreast  of  the  current  trends,  the  INCO  model  utilizes  a 
modified  version  of  Putnam's  "software  equation"  -  the  same  as  that  used 
by  SaM.'y's  SFO  [ref.  9]. 
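
For orientation, the commonly published form of Putnam's software equation relates size S, life-cycle effort K (man-years), and development time td (years) through a technology constant Ck: S = Ck * K^(1/3) * td^(4/3). The Python sketch below implements that textbook form only; it is not INCO's modified version, whose details are not given in this section, and the function names and the sample values of Ck, S, and td are hypothetical.

```python
def putnam_size(ck, effort_my, td_years):
    """Textbook software equation: S = Ck * K**(1/3) * td**(4/3)."""
    return ck * effort_my ** (1.0 / 3.0) * td_years ** (4.0 / 3.0)


def putnam_effort(size, ck, td_years):
    """Solve the same equation for life-cycle effort K in man-years."""
    if size <= 0 or ck <= 0 or td_years <= 0:
        raise ValueError("size, technology constant, and schedule must be positive")
    return (size / (ck * td_years ** (4.0 / 3.0))) ** 3


if __name__ == "__main__":
    ck = 5000.0        # hypothetical technology constant
    size = 50000.0     # hypothetical size in source statements
    for td in (2.0, 1.5):
        print(td, round(putnam_effort(size, ck, td), 1))
```

In this form effort varies as the inverse fourth power of the schedule, so the estimate is very sensitive both to schedule compression and to the value chosen for the technology constant, which is consistent with the emphasis placed above on determining that constant accurately.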

With  the  trend  toward  better  dissemination  of  information, 
particularly  in  the  area  of  graphics,  INCO  has  already  begun  the  design 
and  development  of  general-purpose  graphics  capabilities  for  its 
microprocessor-based  hardware.  With  the  ever-increasing  advancements 
being  made  in  the  low-cost  end  of  this  hardware  market,  INCO  has  in  sight 
the  reality  of  truly  cost-effective  generalized  graphics  capabilities  and 
hopes  to  enhance  the  cost  model  with  such  facilities. 

Given  the  time  and  a  few  more  advancements  in  the  various 
technologies,  INCO  is  confident  in  its  ability  to  produce  a  true  software 
life  cycle  cost  and  cost  estimation  model  with  a  full  complement  of 
graphical capabilities on low-cost hardware intended for in-house
operation  and  ownership. 
SECTION 3. SPECIFIC RESULTS


The scale factors and metrics that have been defined are described
in Section 3.1, with further details to be found in the reports "Software
Scale Parameters" and "System Scale Factor Metrics" (Appendices B and C).
The interrelationships among scale factors derived as results of
experimentation with the operating system performance simulator are
discussed in Section 3.2. The basic and generalized decision factors and
guidelines to be used by system architects in determining when scaled
systems should be used as a part of a design effort are discussed in
Section 3.3, and anticipated cost benefits of scaling are discussed in
Section 3.4.

3.1  Scale  Factors 

Software scale parameters are those aspects of automated systems
that can be reduced in scope in order to implement a cost-effective
system scaled with respect to the full-scale system objectives. The
list of software scale parameters was developed in Task 1, Subtask 1,
and is described in the report "Software Scale Parameters" (Appendix B).
The categories of software elements determined to be applicable to
scaling were identified as data base, performance, functionality,
security, maintainability, reliability, language, and hardware
configuration. This section discusses aspects of system development
that contribute to system cost and performance, and that are amenable
to scaling.

3.1.1  Data  Base 

Data base characteristics include data base complexity (of access
method and data structure) and data base size (number and length of
files, number of access keys, number and length of fields). Data base
access complexity may be scaled by first employing the access method that
would be the simplest for that size data base and then developing the
scaling relationships involved in increasing the complexity, e.g., from
sequential to indexed sequential to random access.

Some data bases deal with a relatively small set of different items.
For example, the data base for an inventory control system might include
only the following information: part number, description, quantity on
hand, reorder point, supplier, reorder quantity, and unit cost. Most
intelligence data bases, on the other hand, include a wide variety of
information, covering such diverse subjects as different orders of
battle, lines of communication, vessel movements, political and economic
data, biographical information, etc. Data bases containing many
different types of information are clearly more difficult to implement
than are those limited to a very narrow subject area. As the diversity
of a data base increases, development costs also increase because
additional data formats and structures must be defined, different data
base load programs may have to be developed, and new application
programs will probably have to be implemented.

The number of different data types does not, per se, have a
significant impact on performance. As the number of different data types
increases, there may be some additional overhead to search directories
for control records for specific data types, but this overhead is usually
insignificant compared to that required to locate a specific data item of
a given data type. Hence, the major performance impact is associated
with the volume of data, which might be expected to increase as the
number of different data types increases. The main reason for scaling
the number of different data types relates to implementation cost.
Restricting a scaled system to a subset of the total number of required
data types may reduce the amount of data definition required, the variety
of data base load programs necessary, and the number and complexity of
application programs within the scaled system.

The amount of data resident within a data base, usually measured in
terms of characters or records, tangentially impacts cost and
significantly impacts performance. Neglecting all factors other than
data volume, it should theoretically be just as simple to implement a
large data base as a small one. A data base management system and the
related application programs should be capable of handling any volume of
data required by a system. However, the performance impacts of data
volume dictate that more sophistication be implemented for processing
large data bases than for small ones, in order to maintain an acceptable
level of performance. For a small data base, therefore, a sequential
file organization may be adequate. To achieve acceptable performance
from a large data base system, however, a more complex data storage
technique, such as a hierarchical or network structure, is usually
required. The additional complexity required by additional data volume
adds to the cost of large data base systems.

While not direct software implementation costs, additional life
cycle management costs are incurred by large data bases. The initial
process of loading a large data base will cost more than that for a small
one, due to the additional data conversions, consistency corrections, and
possibly manual entry required. Maintaining a large data base is also
more costly than maintaining a small one, due to the amount of checking
that must be continually performed to establish and maintain the
integrity of the data.

As mentioned above, the performance of a data base system can be
expected to decrease as the volume of data increases. The amount of
performance decrease is dependent on the sophistication of the data
access techniques employed. For example, performance of sequential data
bases will degrade significantly as data volume increases. On the other
hand, performance of hierarchical data bases may not be perceptibly
influenced by wide variations in data volume, provided that the types of
requests made upon the data base follow the established hierarchy.
Performance on requests that require searching of the entire data base,
or significant portions thereof, will degrade markedly with increases in
data volume regardless of the data base structure employed.

The major objective in scaling data base volume is to simplify the
implementation of a data base system. Reducing the volume of data
naturally simplifies loading a scaled data base. In addition, less
sophisticated data storage techniques can be used with reduced amounts of
data. In extrapolating ultimate system performance from scaled system
performance, allowances must be made for any additional data access
sophistication to be implemented, as well as for the performance impacts
of increased data volume.
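
The dependence of retrieval cost on data volume described above can be
illustrated with a minimal sketch. It is not taken from the report: the
function names are hypothetical, and the cost measure (number of records
or index entries examined per request) is an assumed simplification that
ignores disk blocking and buffering.

```python
import math


def sequential_probes(n_records: int) -> float:
    """Average records examined to find one key in an unordered sequential file."""
    return (n_records + 1) / 2


def indexed_probes(n_records: int) -> int:
    """Worst-case entries examined by a binary search of a sorted index."""
    return math.ceil(math.log2(n_records + 1))


def full_scan_probes(n_records: int) -> int:
    """A request that must search the entire data base touches every record."""
    return n_records


if __name__ == "__main__":
    # Compare a scaled data base with progressively larger volumes.
    for n in (1_000, 100_000, 10_000_000):
        print(f"{n:>10,d} records: "
              f"sequential {sequential_probes(n):>11,.1f}  "
              f"indexed {indexed_probes(n):>3d}  "
              f"full scan {full_scan_probes(n):>10,d}")
```

Under these assumptions the sequential and full-scan costs grow in
proportion to the number of records, while the indexed cost grows only
logarithmically, which is why an extrapolation from a scaled system must
account for the access method as well as the data volume.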

With  additional  data  access  sophistication  included  in  the  ultimate 
system,  its  performance  may  be  equal  to  or  better  than  that  of  a  scaled 
system,  even  though  the  volume  of  data  is  dramatically  increased. 

Data base conceptual complexity is used here to denote the degree to
which the data elements within a data base are mutually interdependent.
Conceptually simple data bases contain data which do not depend, to any
great degree, on other data within the data base. For example, a data
base used by a magazine publisher may include data on subscribers,
advertisers, contributors, and production mechanics (ink and paper
inventories, etc.). These four types of data bear no relation to each
other. On the other hand, an intelligence data base might contain data
on enemy weapon positions, technical weapon characteristics, friendly
installation locations, and intelligence sources. Enemy weapon positions
are correlated with technical weapon characteristics to determine their
threat to friendly installation locations. All data is also correlated
according to the intelligence sources. This is an example of a
conceptually complex data base, with many types of information dependent
on other types. A conceptually complex data base is far more expensive
to implement than is a conceptually simple one. Data structures must be
designed that permit rapid correlation of different types of information,
and applications must be designed to maintain the integrity of all data
interrelationships. Conceptually complex data bases will typically not
perform as well as comparable conceptually simple ones. Extensive data
correlations require additional data base accesses, as well as data base
storage overhead to maintain efficiency.

Many data correlations can be left unimplemented, implemented via
manual means, or implemented through a semi-automatic technique such as
multiple queries with intermediate hit files for a scaled system. This
can significantly reduce the cost of implementing a scaled system.
Relative performance of the scaled and ultimate systems would depend on
many implementation factors.

The cost of implementation complexity is generally dependent on the
underlying conceptual complexity of the data. For conceptually simple
data bases, a complex implementation will generally be more expensive
than a simple implementation. The reason for this is that a simple
implementation would suffice to fit the data definition, and adding
complexity tends to increase cost. (A complex implementation may be
required, however, due to the performance considerations noted above,
based on data volume.) For conceptually complex data bases, a simple
implementation will generally be more expensive than a complex one. This
is because, with a simple data structure, all application programs must
be aware of the complexities of the data relationships. With a complex
implementation, a sophisticated data base management system typically
relieves the application programs from consideration of many of the
conceptual complexities. Cost aspects are clearly dependent on the
number of application programs required, the degree to which the data
base management system can insulate the application programs from the
conceptual complexities, and whether a data base management system can be
used intact or must be specially developed or modified.

A complex implementation of a data base will generally yield better
performance than will a simple implementation. This is because direct
access techniques (directories and hashing) improve data access times,
and pointers or links between records speed the processing of data
interrelationships. There is, however, a point beyond which additional
implementation complexity becomes overkill for the underlying conceptual
complexity and data volume. Past that point, the overhead required to
maintain seldom-used directories or links may begin to degrade
performance. In any event, any implementation complexity must be
carefully designed to parallel the conceptual complexity, thus improving
performance for the precise uses to which a data base will be put.
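
The cost trade-off between simple and complex implementations can be
sketched as a break-even calculation. The linear cost model below is an
assumption made only for illustration, not the INCO cost model; all
function and parameter names are hypothetical, and the cost units are
arbitrary.

```python
import math


def simple_cost(n_apps: int, storage_cost: float, app_cost: float,
                relationship_burden: float) -> float:
    """Flat-file implementation: every application program must handle
    the data relationships itself."""
    return storage_cost + n_apps * (app_cost + relationship_burden)


def complex_cost(n_apps: int, dbms_cost: float, app_cost: float,
                 relationship_burden: float, insulation: float) -> float:
    """DBMS-based implementation: the DBMS absorbs a fraction
    `insulation` (0 to 1) of each program's relationship burden."""
    remaining = (1.0 - insulation) * relationship_burden
    return dbms_cost + n_apps * (app_cost + remaining)


def break_even_apps(storage_cost: float, dbms_cost: float,
                    relationship_burden: float, insulation: float) -> float:
    """Number of application programs above which the complex
    implementation becomes the cheaper of the two."""
    saving_per_app = insulation * relationship_burden
    if saving_per_app <= 0:
        # Conceptually simple data: the DBMS never pays for itself.
        return math.inf
    return (dbms_cost - storage_cost) / saving_per_app
```

With a relationship burden of zero (conceptually simple data) the
break-even point is never reached, while a large burden and a high degree
of insulation favor the complex implementation even for a small number of
application programs.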

Since a scaled system need not support the conceptual complexity,
data volume, or performance of an ultimate system, data base
implementation complexity is very amenable to scaling. Using a simple
implementation methodology will, in general, result in significant cost
savings, provided that conceptual complexity is likewise scaled. Thus, a
series of simple flat files, without complex data dependencies, might be
used in a scaled system instead of a complex hierarchical or network
structure. Estimating ultimate performance based on such a scaled system
requires detailed analysis of the advantages gained by going to a more
complex implementation philosophy.

Some forms of data lend themselves very readily to proven data base
technology, whereas other, more exotic, data forms are still being
investigated for efficient exploitation within a data base. For example,
a data base of bank transactions contains well-defined data, constructed
in accordance with fairly rigid formats, and subject to easily expressed
validity checks. Becoming slightly more exotic, a data base of
bibliographic information contains much English language text. Many such
data bases have been constructed, but research is still underway on
improving the effectiveness and efficiency of such data bases. At
perhaps the most exotic extreme, several research programs within the
intelligence community are currently examining ways of using data bases
of digitized imagery. Such data bases would contain enormous volumes of
data, and would require special algorithms to effectively distill
information from the imagery data.

The expense of implementing a data base increases as the data
within it deviates further and further from forms normally stored within
conventional data bases. This is primarily due to two factors. First,
conventional data usually lends itself to easily-defined structures,
whereas efficient structures and even expected access criteria for
unconventional data are usually difficult to define. Second, the
algorithms for manipulating conventional data have been implemented many
times and are well-understood, while the algorithms for manipulating
unconventional data are often the subject of ongoing research and
development. The net result of these two factors is that implementation
of conventional data bases can proceed in a straightforward manner from
design with little risk, whereas implementation of exotic data bases
often includes many design changes and continual experimentation, with
the attendant high cost and risk.

The structured nature of conventional data forms lends itself to
efficient implementations of such data bases. As mentioned above,
efficient structures and expected access modes are often not known for
the more exotic forms of data. This naturally leads to difficulties in
achieving good performance for data bases containing such data. Since
the use of unconventional data forms greatly increases cost and reduces
performance, omitting such data from a scaled system will certainly make
it much easier to implement. However, one of the reasons for building a
scaled version of a system requiring exotic data forms will usually be to
prove the feasibility of processing such data. Hence, the
conventionality of data forms would typically not be scaled, with
economies of scaled system implementation realized elsewhere.

3.1.2 Performance Indices


The classes of quantitative performance indices identified for
scaling are productivity, interactive responsiveness, utilization, and
operating system organization. Productivity is composed of the amount of
work that can be physically accommodated and the rate at which it is
ultimately accomplished. The amount of work can be measured by deriving
the system capacity (the amount of information the system can contain at
any given time) as well as the capacity of the hardware components.
The throughput, the average rate at which jobs are completed by the
system in a given interval of time, is a result of nearly every aspect of
a system configuration, from the hardware itself, to the functions the
system is required to perform, to the typical set of jobs requiring
system resources, i.e., the job mix. The scaled system design would have
a scaled system capacity as well as a scaled job mix, structured for
optimum performance. These factors all contribute to interactive
responsiveness, the number of responses per unit time, which is the
inverse of the time between the presentation of an input to the system
and the appearance of the corresponding output. Because this parameter
is difficult to predict at the front end of the implementation phase, it
will usually be quantified through observation. That is, a response time
may be set as a target. The scaled system might reveal that the chosen
design does not produce the required responsiveness. The full-scale
system design specification could then be cost-effectively adjusted at
the front end of the design cycle, where economic leverage is the
greatest.

Utilization is defined as the ratio of the time a specified part of
the system is used to a given interval of time. Modules may be linearly
scaled as the ratio between proposed and actual module utilization, where
scale factors are in terms normal for the module, e.g., memory
utilization is measured as a percentage of total memory available.
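
The quantitative indices defined in this section reduce to simple
ratios, sketched below. The function names are hypothetical, and the
reading of the linear scale factor as proposed utilization divided by
actual utilization is an interpretation of the text rather than a
formula quoted from Appendix B.

```python
def throughput(jobs_completed: int, interval_s: float) -> float:
    """Average rate at which jobs are completed (jobs per second)."""
    return jobs_completed / interval_s


def responsiveness(response_time_s: float) -> float:
    """Responses per unit time: the inverse of the time between an input
    and the appearance of the corresponding output."""
    return 1.0 / response_time_s


def utilization(busy_time_s: float, interval_s: float) -> float:
    """Fraction of an interval during which a system component is in use."""
    return busy_time_s / interval_s


def utilization_scale_factor(proposed: float, actual: float) -> float:
    """Linear scale factor between proposed (scaled) and actual
    (full-scale) module utilization. Assumed interpretation."""
    return proposed / actual
```

For example, a module that is busy 45 seconds out of every 60 has a
utilization of 0.75; if the scaled system is intended to use 25 percent
of memory where the full-scale system uses 50 percent, the scale factor
under this interpretation is 0.5.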

Operating  system  organization  subelements  were  identified  in
"Software  Scale  Parameters"  (Appendix  B)  as  processing  mode,  operating
system,  and  interrupt  processing.  These  parameters  represent  a  mode  of 
operation  rather  than  a  measurable  ratio  and  thus  are  difficult  to 
quantify.  However,  the  choice  of  one  mode  over  another  is  a  valid  method 
to  scale  performance.  Scaling  system  aspects  applicable  under  this 
category  would  undoubtedly  be  highly  case-dependent  and  quantifying  the 
factors  largely  subjective. 

3.1.3  Functionality 

The  approach  to  scaling  functionality  consists  of  reducing  the 
variety  of  functions  supported  or  reducing  the  functional  complexity. 
The  first  method  entails  vertical  functional  scaling  (eliminating 
subsystems);  the  second,  horizontal  functional  scaling. 

3.1.4  Security 

Consider next the scaling of security functions. The degree of
security provided for software and data is determined by the scope of
access control (those attributes of software that restrict access to and
manipulation of programs and data) and the completeness of access audit
(the procedure whereby an historical record is maintained of both
successful and unsuccessful attempts to access restricted data). Security
may be considered a valid parameter for scaling when the scaled system
will be developmental in nature and when either adequate physical
safeguards may be substituted for the full-scale software security
procedures or the data to be protected is simulated or is non-sensitive
public test data.

The basic goal of data base security is to prevent information from
falling into the hands of individuals not authorized to receive it. Two
major questions must be answered in designing a data base security
system: How shall it be decided who has access to information, and what
is the smallest unit of information to which access will be controlled?

The  first  question,  that  of  determining  individual  access  rights, 
has  predominantly  been  answered  through  two  different  approaches,  by  user 
or  by  classification.  The  two  approaches  are  sometimes  also  used 
together.  The  scheme  controlling  access  by  user  effectively  tags  each
item  to  which  access  is  controlled  with  a  list  of  those  users  allowed 
access  to  the  item.  Users  requesting  access  to  an  item  must  be  on  the 
list  for  that  item  in  order  to  be  permitted  access.  The  scheme 
controlling  access  by  classification  tags  each  item  to  which  access  is 
controlled  with  the  item's  security  classification,  special  handling 
instructions,  releasability,  and  so  on.  Each  user,  and  perhaps  terminal, 
has  permission  to  access  data  with  only  certain  security  classifications, 
special  handling  instructions,  and  releasabilities.  The  system  compares 
user  access  privileges  with  the  classification  of  a  requested  data  item 
before  granting  access. 
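
The two access-control schemes described above, and their combination,
can be sketched as follows. This is a minimal illustration, not a
description of any fielded system: the class and function names are
hypothetical, and classifications and handling instructions are modeled
simply as sets of labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    classification: str                      # label tagged on the item
    handling: frozenset = frozenset()        # special handling instructions
    allowed_users: frozenset = frozenset()   # per-item access list


@dataclass(frozen=True)
class User:
    name: str
    classifications: frozenset               # classifications the user may access
    handling: frozenset = frozenset()        # handling instructions the user holds


def allowed_by_user(user: User, item: Item) -> bool:
    """Access by user: the requester must be on the item's access list."""
    return user.name in item.allowed_users


def allowed_by_classification(user: User, item: Item) -> bool:
    """Access by classification: compare user privileges with the item's tags."""
    return (item.classification in user.classifications
            and item.handling <= user.handling)


def allowed_combined(user: User, item: Item) -> bool:
    """Both schemes used together: each check must pass."""
    return allowed_by_user(user, item) and allowed_by_classification(user, item)
```

A scaled system might implement only one of the two checks, or apply the
check at the file level rather than at the record or field level.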

The second question, that of the size of units of information to
which access is controlled, has also been answered in several ways.
Virtually all systems control access at the system level, with user
sign-on password authentication. Most systems control access to
individual files in some fashion, and many systems even control access to
individual records within files. Some systems go so far as controlling
access to individual fields within records.

Related to security is the requirement to maintain an audit trail of
all operations taken against the data. This audit trail normally
contains more information than the transaction log maintained by a data
base management system to support data integrity. Preserving data
integrity requires logging of only update transactions, whereas a
security audit trail also requires recording of all data read from a data
base. The degree to which security audit trails are implemented
for typical intelligence systems varies. Virtually all systems record
user sign-on and sign-off. Many systems also record major function
invocation. Almost no systems record the actual data manipulated by
users. Other aspects of system security include accreditation for
operation with classified information and the problems of obtaining
cleared programmers and facilities.
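
The difference between an integrity transaction log and a security
audit trail can be sketched in a few lines. The operation names and the
record layout below are assumptions made for illustration only.

```python
# Operations that change the data base (assumed names).
UPDATE_OPS = {"insert", "update", "delete"}


def record_operation(op, user, item, transaction_log, audit_trail):
    """Append one operation to each log that needs it."""
    entry = (user, op, item)
    if op in UPDATE_OPS:
        # Data integrity only requires a record of update transactions.
        transaction_log.append(entry)
    # A full security audit trail also covers reads of the data.
    audit_trail.append(entry)
```

Scaling the audit function then amounts to narrowing what is appended
to the audit trail, for example recording only sign-on and sign-off.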

A security system can be considered scaled if it encompasses a file
protection methodology less restrictive than that of the full-scale
system. This scaling can take the form of, for example, a less
sophisticated level of file protection, a smaller access matrix,
elimination of codewords, audit trails, or encryption, and/or
simplification of the authentication mechanism.

3.1.5  Maintainability 


Maintainability is defined as the probability that, when maintenance
action is initiated under stated conditions, a failed system will be
restored to an operable condition within a specified time. It also
refers to the effort required to locate and fix an error in an
operational program and is a technically valid area for scaling, since
the implementation of maintainability involves increased software
development cost and/or time. Maintainability is a function of the
capabilities included in the system, the skill level of the personnel,
and the support facilities (locally available tools and diagnostic test
equipment or aids, spare parts or alternative program versions or back-up
files). Since scaling of this parameter would involve the elimination or
simplification of functional requirements of the system, the approach
would be similar to that for scaling functionality. However, eliminating
modules whose purpose is to enhance maintainability may prolong the
project rather than speed its progress. Such considerations must be
emphasized when scaling is contemplated.
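
The probabilistic definition of maintainability given above can be
estimated from observed repair records, as in the following minimal
sketch. The function name is hypothetical, and the empirical-fraction
estimator is an assumption, not a method prescribed by the report.

```python
def maintainability(repair_times_h, time_limit_h):
    """Estimate the probability that a failed system is restored within
    `time_limit_h` hours, as the fraction of observed repairs that met
    the limit."""
    if not repair_times_h:
        raise ValueError("at least one observed repair time is required")
    restored = sum(1 for t in repair_times_h if t <= time_limit_h)
    return restored / len(repair_times_h)
```

For example, if three of four recorded repairs were completed within a
two-hour limit, the estimated maintainability for that limit is 0.75.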

Among the maintenance modules which could be scaled are process
error handling (minimize the number of conditions to be checked),
restart/recovery procedures, data correction, fault detection/trap
software, monitors of system performance, and back-up procedures.
Development and diagnostic aids such as program tracers and interactive
debuggers might actually be added, to reduce the development effort of
the full-scale system.

3.1.6  Reliability 

Reliability can be defined as the probability of satisfactory
performance for a given time when used under stated conditions, the
metric being defined as the number of failures per unit time. A software
failure is an occurrence of a software error, in which the software does
not do what the user reasonably expects it to do. In order to prevent
failures from occurring in the first place, a certain amount of
redundancy is built into systems such that automatic diagnosis and
recovery can be accomplished by the software itself without operator
attention or intervention. This redundancy requires additional design,
system storage, programming, and effort; thus reliability may be scaled
with respect to these aspects.
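
The two views of reliability given above, a failure rate and a
probability of satisfactory performance over a given time, can be
connected by a simple sketch. The constant-failure-rate (exponential)
model used here is a common assumption, not one stated in the report,
and the function names are hypothetical.

```python
import math


def failure_rate(failures: int, operating_time_h: float) -> float:
    """The metric from the text: number of failures per unit time."""
    return failures / operating_time_h


def reliability(rate_per_h: float, mission_time_h: float) -> float:
    """Probability of failure-free operation for `mission_time_h` hours,
    assuming a constant failure rate (exponential model)."""
    return math.exp(-rate_per_h * mission_time_h)
```

Under this assumption, a scaled system that tolerates a higher failure
rate gives up reliability exponentially with the length of the period
over which satisfactory performance is required.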

Some reliability elements amenable to scaling would include
precision, error detection software (eliminate software geared to errors
which would occur infrequently in practice or not at all in the input to
the scaled system), approximation algorithms (use fast, simple, less
accurate approximation functions and algorithms), and coding standards.
Relaxation in enforcement of coding standards might only be considered
where recoding would be necessary to implement the full system. If,
however, the scaled system will form the basic structure for the full
system, then strict coding standards should be maintained.

3.1.7  Programming  Language 

Two aspects of programming language suitable for scaling are
language selection and implementation. A language that is optimal for
the scaled system but different from the one chosen for the full-scale
system might be selected if the target system is to be completely
recoded. Such a language might be chosen based upon considerations of
top-down design, code readability, and modifiability, thereby
contributing to accelerated program development. Scaling language
implementation could be accomplished by successively enhancing a baseline
subset of the language being implemented.

3.1.8 Hardware Configuration


The  choice  of  individual  hardware  components  and  their  configuration 
is  an  important  aspect  of  the  scaled  systems  methodology.  Significant 
savings  in  schedule,  effort  and  cost  may  be  achieved  by  reconfiguring  the 
target  system  hardware  or  by  selecting  an  alternate  operational 
environment  for  the  scaled  systems  effort.

In order to reduce complexity, either the number of component types
or the total number of components may be scaled, both approaches reducing
total system complexity. Factoring hardware configuration is simplified
by the numerically descriptive nature of hardware. Some hardware
elements that it might be possible to scale are: number of CPUs (scale
from multiprocessing to a single processor), number and/or type of
peripherals, size of the instruction set of a CPU, input devices
(simulate the data instead), number of communications nodes, complexity
of communications network or hierarchy (lower the number of linkages
among nodes), and level of service to peripherals (eliminate prioritized
service).

3.1.9  Simulator  Variable  -  Scale  Factor  Relationships 

The scale factors introduced in the report "System Scale Factor
Metrics" (Appendix C) influence and interrelate with each other in
complex ways that can be quite different for different operating regimes
(e.g., disk-limited, CPU-limited) of the IDHS being modeled. In order to
scale a system, ways are needed of predicting changes in system
characteristics as the scaled parameters vary, even when the variations
are large enough to place the IDHS into a different operating region.
For example, if the scaled system has a factor of four fewer terminals
than the envisioned full-scale system, it is necessary for the system
designer to know how system throughput will degrade when the system is
scaled up and terminals are added. The report "Interrelationships Among
Scaling Factors" (Appendix D) addresses these considerations.

Figure 3-01 illustrates the functional relationship between
throughput and CPU speed, demonstrating that scaling a given system
parameter will not necessarily affect another parameter in the same way
in all operating regions; e.g., increasing CPU speed in a CPU-limited
regime will affect throughput significantly, while in a disk-limited
regime it will have very little effect. The relationships between
parameters are thus complex and non-linear, and it is not possible to
write down analytic expressions that will hold under all conditions.
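
The regime-dependent behavior described above can be illustrated with
a simple bottleneck bound. This is not the report's simulator: it is an
assumed two-resource model in which throughput is capped by whichever of
the CPU or the disk saturates first, and all names and numbers are
hypothetical.

```python
def throughput_bound(cpu_instr_per_s: float, instr_per_job: float,
                     disk_accesses_per_s: float, accesses_per_job: float) -> float:
    """Upper bound on jobs per second for a system with one CPU and one
    disk: the slower of the two resources limits throughput."""
    cpu_limit = cpu_instr_per_s / instr_per_job
    disk_limit = disk_accesses_per_s / accesses_per_job
    return min(cpu_limit, disk_limit)


if __name__ == "__main__":
    # Each job needs 1,000,000 instructions and 10 disk accesses;
    # the disk sustains 50 accesses per second (5 jobs per second).
    for cpu_speed in (1e6, 2e6, 1e7, 2e7):
        jobs = throughput_bound(cpu_speed, 1e6, 50.0, 10.0)
        print(f"CPU {cpu_speed:>12,.0f} instr/s -> {jobs:.1f} jobs/s")
```

In this sketch doubling CPU speed doubles throughput while the system is
CPU-limited, but has no effect once the disk limit is reached, which is
the qualitative shape the text attributes to Figure 3-01.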

In order to provide the system designer with tools that will enable
him to predict performance under the wide range of scaling conditions
encountered in practical situations, a concept was evolved that uses a
simulation model of a generalized IDHS to predict performance and to
predict changes in one variable from changes in another; i.e., the
simulation substitutes for the missing precise analytic functional
relationships between the various scaled parameters.

It has been found that a fairly small number of parameters is
adequate to specify each particular IDHS to the simulation. Each of the
input parameters, in turn, can be expressed as a fairly simple analytic
function of the scaling parameter factors. A series of formulae is used
in steps to relate the simulator variables to scale factors. A diagram
of the technique is shown in Figure 3-02.

Figure 3-01. Throughput - CPU speed functional relationship

Figure 3-02. Relating Scale Factors to Simulation Variables

As shown in Figure 3-03, an example of a simulator input variable
is CPU service time, i.e., time in CPU per CPU block, where a block is a
set of instructions until a disk access is encountered. The following
formulae, as described in the report "Simulator Variable-Scale Factor
Equations" (Appendix D), are one set that can be used to relate CPU
service time to system scale factors.

CMEAN can be defined as follows:

$$\text{CMEAN} = \frac{\text{instructions executed/block}}{\text{instructions/time (power)}}$$

Define the following terms:

$N_D$ = number of disk accesses
      = $N_{DB}$ (number of data base accesses) + $N_{DP}$ (number of paging disk accesses)

$I$ = number of instructions
    = $I_C$ (number of computational instructions) + $I_{DB}$ (number of data base instructions) + $I_P$ (number of paging instructions)

Define the frequency of data base disk accesses per computational
instruction,

$$F_{DB} = \frac{N_{DB}}{I_C}$$

Then

$$N_{DB} = I_C \left[ \frac{N_{DB}}{I_C} \right] = I_C F_{DB}$$

Also

$$N_{DP} = K_P \, I_C \, \frac{C_V}{C_R}$$

where $K_P$ is a system-dependent constant calculated as the number of
paging accesses/computational instruction, $C_V$ is the virtual core for a
particular job (the job size) and $C_R$ is the real core for the job (the
actual core available for the job).

Then

$$N_D = N_{DB} + N_{DP}$$

$$N_D = I_C F_{DB} + K_P \, I_C \, \frac{C_V}{C_R}$$

$$\frac{I_C}{N_D} = \frac{1}{F_{DB} + K_P \dfrac{C_V}{C_R}}$$

$I_C / N_D$ is the number of instructions per block so,

$$\text{CMEAN} = \frac{\left[ F_{DB} + K_P \dfrac{C_V}{C_R} \right]^{-1}}{\text{instructions/time}}$$
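
As a check on the derivation, the CMEAN formula can be evaluated directly.
The following is a minimal sketch; the function name, argument names, and
the sample values are assumptions made for illustration and do not come
from the report.

```python
def cpu_service_time(f_db, k_p, c_v, c_r, power):
    """Mean CPU time per block (CMEAN), in the time unit used by `power`.

    f_db : data base disk accesses per computational instruction (F_DB)
    k_p  : paging accesses per computational instruction (K_P)
    c_v  : virtual core needed by the job (C_V)
    c_r  : real core available to the job (C_R), same unit as c_v
    power: instructions executed per unit time
    """
    accesses_per_instruction = f_db + k_p * c_v / c_r   # N_D / I_C
    instructions_per_block = 1.0 / accesses_per_instruction
    return instructions_per_block / power


if __name__ == "__main__":
    # Hypothetical values: one data base access per 10,000 instructions,
    # a job twice as large as its real core, one million instructions/s.
    print(cpu_service_time(f_db=1e-4, k_p=1e-5, c_v=200, c_r=100, power=1e6))
```

Giving the job more real core lowers the paging term, so blocks become
longer and CMEAN rises, while the number of disk accesses per instruction
falls.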

To find a value for $F_{DB}$, estimates and typical numbers will be
sought. The value will depend on the function being performed and the
probability of having to make a disk access. Several factors affect the
probability that a piece of information is in core vs. on disk, such as
the amount of the data base that is stored in core at any time, the
organization of data on the disk (the data base structure), and the data
manipulation algorithms. Also, since $F_{DB}$ was defined as $N_{DB}/I_C$,
it may be possible to calculate $N_{DB}$ for a given function and data-base
organization, while $I_C$ would also be a function of scale factors, such
as the function being performed and the data base size and data base
complexity. Thus $F_{DB}$ could be derived in this way.

A way to approach the problem of evaluation might be to start with
"reasonable" estimates for these parameters. When the scaled system is
operational, they can be measured by monitoring system behavior. Indeed,
the purpose of building the scaled system is to measure the parameters
which will be used in the full-scale system so that flexibility in the
design of the full-scale system can be retained. The small-scale system
together with the simulation will enable the designer to see what will
work in the full-scale system.

To find $C_V$ for each job of type j, assume input values for the
simulation parameters mean $C_V(j)$ and $\sigma_{C_V}(j)$. As the job
begins, pick the actual $C_V$ according to a probability distribution
function.

For $C_R$, the real available core for the job, the following
system-dependent values can be input:

$C_T$ = total core for the machine

$C_{OS}$ = operating system core

Then

$$C_R = \frac{C_T - C_{OS}}{\text{number of jobs running}} = \frac{C_T - C_{OS}}{\text{number of terminals}}$$
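
The two core quantities can be sketched together. The report does not say
which probability distribution is used for the job size, so a normal
distribution truncated at a small positive minimum is assumed here purely
for illustration; the function names and sample values are hypothetical.

```python
import random


def real_core_per_job(total_core, os_core, jobs_running):
    """Real core available to each job: (total - operating system) / jobs."""
    if jobs_running <= 0:
        raise ValueError("jobs_running must be positive")
    return (total_core - os_core) / jobs_running


def sample_virtual_core(mean, std_dev, rng, minimum=1.0):
    """Draw a job size C_V; the normal distribution is an assumption."""
    return max(minimum, rng.gauss(mean, std_dev))


if __name__ == "__main__":
    rng = random.Random(1)                      # fixed seed for repeatability
    c_r = real_core_per_job(total_core=512, os_core=112, jobs_running=20)
    c_v = sample_virtual_core(mean=30.0, std_dev=10.0, rng=rng)
    print(f"C_R = {c_r:.1f}, sampled C_V = {c_v:.1f}, C_V/C_R = {c_v / c_r:.2f}")
```

The ratio printed on the last line is the quantity that multiplies the
paging constant in the CPU service time formula.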

In the equations that have been discussed, the simulator parameters
have been defined as functions of many of the scale factors, including
power (instructions/time), number of terminals, real core (system
capacity), number of instructions (related to data base complexity and
structure), hardware, functionality, and security core (involved in the
calculation of $C_R$, the real available core for a job). The use of the
simulation then permits analysis of scale factor interrelationships.

The simulation, as described in the report "Simulator Description"
(Appendix E), will be used by the system designer in an iterative manner
in the course of specifying the full-scale system. The scaling factors
will be specified and used as input to the simulation, the output will be
examined, and scaling will be respecified until the desired outputs,
i.e., full-scale system behavior, are achieved. Typical questions would
be: How much can the data base size be scaled up with present disk
hardware without going below the minimum required responsiveness
(responses per unit time)? Will it be necessary to have more and/or
faster disks in order to achieve the desired full-scale system
responsiveness and incorporate the necessary data base size? If access
time is improved by a given amount, how much can the data base then be
scaled up? The system designer will look at the results of the simulation
based on a set of values for the scaling parameters and iteratively adjust
these values. Such respecifications of scaling may well result in design
changes for the full-scale system, e.g., by going to more and/or more
powerful hardware. Thus the tools to be used will be the simulator and
the set of input variables.
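
The iterative respecification described above can be pictured as a search
over one scale factor. The sketch below is only an illustration: it
assumes responsiveness never improves as the scale factor grows, and it
uses a toy stand-in (`toy_simulator`, a hypothetical function) in place of
the report's simulator.

```python
def largest_feasible_scale(simulate, min_responsiveness,
                           lo=1.0, hi=64.0, tol=0.01):
    """Largest scale factor in [lo, hi] that still meets the requirement.

    simulate: callable mapping a scale factor to responsiveness
              (responses per unit time); assumed non-increasing.
    Returns None if even the smallest scale factor fails.
    """
    if simulate(lo) < min_responsiveness:
        return None
    if simulate(hi) >= min_responsiveness:
        return hi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if simulate(mid) >= min_responsiveness:
            lo = mid          # requirement met: try a larger data base
        else:
            hi = mid          # requirement missed: back off
    return lo


def toy_simulator(db_scale):
    """Stand-in for the simulator: responsiveness falls as 10 / scale."""
    return 10.0 / db_scale


if __name__ == "__main__":
    print(largest_feasible_scale(toy_simulator, min_responsiveness=2.0))
```

In practice each call to `simulate` would be a full simulation run, so a
search that halves the interval at each step keeps the number of runs
small.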

3.2  System  Design  Methodology  for  Using  Scaled  Systems 
3.2.1  Interrelationships  Among  Scaling  Factors 

The objective of this task was to develop standard procedures for
applying the scaled system technique to new IDHS development projects and
to describe how measurements made on a scaled system can be extrapolated
into predictions for a full-scale system.

The result of Task 1 was a set of scale factors that describe an
IDHS, with appropriate metrics defined on them. The goal of Task 2 was
to determine the predictive value of each scaled parameter before
preparing guidelines for which parameters to scale. Toward this end, an
operating system performance simulator was designed, as discussed in the
report, "Simulator Description" (Appendix F).

Jobs enter the system at random intervals from one of n terminals
(n<100). The exponential probability distribution function is used to
model the job interarrival times because, as noted in Beizer [ref. 6],
assigning this distribution is equivalent to saying that the arriving
customers individually and collectively behave as if they were not aware
of each other's existence, because it is usually (but not always)
pessimistic, and because it leads to reasonable expressions for the
queueing parameters. The use of the exponential interarrival time
distribution leads to a Poisson distribution for the number of arrivals
per unit time.
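
The link between exponential interarrival times and Poisson arrival counts
can be checked numerically. This is a minimal sketch using Python's
standard `random` module; the arrival rate and horizon are arbitrary
illustrative values, not parameters from the report.

```python
import random


def arrival_times(rate, horizon, rng):
    """Arrival instants in [0, horizon) with exponential interarrival times."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def counts_per_interval(times, horizon, width=1.0):
    """Number of arrivals falling in each interval of length `width`."""
    counts = [0] * int(horizon / width)
    for t in times:
        counts[min(int(t / width), len(counts) - 1)] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(42)
    counts = counts_per_interval(arrival_times(2.0, 10000.0, rng), 10000.0)
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    # For a Poisson distribution the mean and variance are both equal to
    # the arrival rate, so the two printed values should be close to 2.
    print(f"mean = {mean:.2f}, variance = {variance:.2f}")
```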

The job is assigned a job class (CPU- or disk-bound) and a CPU/disk
iteration count, based on probability distributions. Each terminal is
assigned a wait or "think" time. The job is placed in a CPU or disk
queue if the required facility is busy; when it gains the facility, it
is assigned a CPU service time and a disk service time.

Experiments with sets of various parameter values were conducted in
order to address the issue of how such factors as the number of
terminals, the job mix (the combination of CPU-bound and disk-bound
jobs), and average CPU service time affect disk utilization, response
time, and other measures of system performance. Tests were run with six
different numbers of terminals, from 8 to 100, with the percentage of
CPU-bound jobs ranging up to 94%. Figure 3-04 summarizes the
experimental results for tests involving mean CPU/disk iteration counts
(the number of times the job flips between the CPU and the disk) of from
4 to 10 for CPU-bound jobs and 12 to 30 for disk-bound jobs.
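
The job life cycle just described can be sketched as a small event-driven
model. This is not the report's simulator: it assumes exponential think
and service times, one first-come first-served CPU and one disk, and a
fixed iteration count per job class. All names and default values are
hypothetical.

```python
import heapq
import random
from collections import deque


def simulate(n_terminals, p_cpu_bound, iters=(4, 12), think=5.0,
             cpu_mean=0.02, disk_mean=0.03, n_jobs=2000, seed=1):
    """Mean response time over the first n_jobs completed jobs."""
    rng = random.Random(seed)
    events = []                                  # (time, seq, kind, job)
    seq = 0
    queues = {"cpu": deque(), "disk": deque()}
    busy = {"cpu": False, "disk": False}
    mean_service = {"cpu": cpu_mean, "disk": disk_mean}
    responses = []

    def schedule(time, kind, job):
        nonlocal seq
        heapq.heappush(events, (time, seq, kind, job))
        seq += 1                                 # unique tie-breaker

    def request(now, facility, job):
        """Start service if the facility is idle, otherwise queue the job."""
        if busy[facility]:
            queues[facility].append(job)
        else:
            busy[facility] = True
            service = rng.expovariate(1.0 / mean_service[facility])
            schedule(now + service, facility, job)

    def release(now, facility):
        """Free the facility and start the next queued job, if any."""
        busy[facility] = False
        if queues[facility]:
            request(now, facility, queues[facility].popleft())

    for _ in range(n_terminals):                 # every terminal starts thinking
        schedule(rng.expovariate(1.0 / think), "submit", None)

    while len(responses) < n_jobs:
        now, _, kind, job = heapq.heappop(events)
        if kind == "submit":                     # pick job class and count
            k = iters[0] if rng.random() < p_cpu_bound else iters[1]
            request(now, "cpu", {"start": now, "left": k})
        elif kind == "cpu":                      # CPU block done: go to disk
            release(now, "cpu")
            request(now, "disk", job)
        else:                                    # disk access done
            release(now, "disk")
            job["left"] -= 1
            if job["left"] > 0:
                request(now, "cpu", job)
            else:                                # job complete: think again
                responses.append(now - job["start"])
                schedule(now + rng.expovariate(1.0 / think), "submit", None)
    return sum(responses) / len(responses)


if __name__ == "__main__":
    for n in (8, 20, 50, 100):
        print(n, "terminals:", round(simulate(n, 0.25), 2), "s mean response")
```

With a single terminal there is no queueing, so the mean response time
should approach the iteration count times the sum of the two mean service
times, which gives a simple sanity check on the model.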

It is necessary to examine the parameter interrelationships and
system behavior in different operating regions, i.e., CPU- or disk-bound.
Figure 3-05 plots the number of terminals against the response time (the
inverse of the responsiveness scale factor) for a system that has
approximately 15% of its jobs in the CPU-bound category. The graph shows
that for a mean of 4 CPU/disk iterations for CPU-bound jobs and a mean of
12 CPU/disk iterations for disk-bound jobs, the increase in response time
is close to being proportional to the increase in the number of
terminals; i.e., increasing the number of terminals by a factor of 2.5
(20 to 50) results in a 2.2-fold increase in the response time, while
doubling the number of terminals from 50 to 100 results in an increase
of approximately 1.8-fold in the response time. As the mean number
of CPU/disk iterations increases, the increases in response time for the
higher numbers of terminals are sharper, as can be seen from Figure 3-05.
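
One way to quantify "close to being proportional" is a local scaling
exponent: the power e for which the response time factor equals the
terminal factor raised to e. An exponent of 1 would mean exact
proportionality. The helper below is an illustrative assumption, not a
measure used in the report; only the two pairs of factors come from the
text.

```python
import math


def scaling_exponent(terminal_factor, response_factor):
    """Exponent e such that response_factor == terminal_factor ** e."""
    return math.log(response_factor) / math.log(terminal_factor)


if __name__ == "__main__":
    # Factors quoted above for the 15% CPU-bound, (4,12) case.
    print(f"20 -> 50 terminals:  e = {scaling_exponent(2.5, 2.2):.2f}")
    print(f"50 -> 100 terminals: e = {scaling_exponent(2.0, 1.8):.2f}")
```

Both exponents come out at roughly 0.85, a little below 1, which agrees
with the description of the growth as nearly, but not exactly,
proportional.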

Figure 3-06 shows the number of terminals plotted against the
response time in the operating region where 25% of the jobs are
CPU-bound. The curve for mean CPU/disk iteration counts of (4,12) follows
the same pattern as the 15% CPU-bound curve in Figure 3-05; i.e., the
sharp increases in response time take place when the mean number of
CPU/disk iterations is highest.

Similar phenomena are demonstrated in Figure 3-07 in the region
where 50% of the jobs are CPU-bound and in Figure 3-08 where 10% and 15%
of the jobs are CPU-bound. The general conclusion illustrated by the
results of these experiments is that response time increases as the
number of terminals increases, with proportionately larger increases
taking place at the higher range of number of terminals and in the
regions where more jobs are disk-bound; i.e., the curves tend to flatten
as the percentage of CPU-bound jobs increases. In addition, as might be
expected, response time increases as the mean number of CPU/disk
iterations increases. The rate of increase is not predictable, however.
Figure 3-09 summarizes the changes in response times as the percentage of
CPU-bound jobs decreases, i.e., the system becomes more disk-bound. It
can be seen that doubling the number of CPU/disk iterations does not
consistently double the response time, and the increases in response time
vary within operating regions. Figure 3-10 summarizes the performance
curves in Figures 3-05, 3-06, and 3-07.

Figure 3-06. Number of Terminals vs. Average Response Time with 25% CPU-bound Jobs

Figure 3-11 demonstrates the system's behavior in the region where
25% of the jobs are CPU-bound and the number of CPU/disk iterations for
CPU-bound and disk-bound jobs is 16, 32, or 40. Again, sharper increases
are seen in the curves representing 50 and 100 terminals, as compared
with those curves for 8 to 20 terminals. Figure 3-12 illustrates similar
behavior for an environment where 50% of the jobs are CPU-bound. Thus,
in general, it can be said that adding CPU/disk cycles to the average job
results in increased response time. Similarly, as demonstrated in Figure
3-13, the increase in the average disk wait time is approximately
proportional to the increase in the number of terminals in all operating
regions examined.

Figures 3-14 and 3-15 show what happens to the response time as the
percentage of CPU-bound jobs increases. Generally, the response time
decreases, with the sharper changes taking place for the curves
representing the larger numbers of terminals. Figure 3-14 illustrates the
results of the experiments with mean CPU/disk iteration counts of 4 for
the CPU-bound jobs and 12 for the disk-bound jobs (indicated as (4,12))
and those with mean CPU/disk iteration counts of 8 and 24, respectively.
Figure 3-15 plots the percentage of CPU-bound jobs vs. response time curve
for a mean CPU/disk iteration count for CPU-bound jobs of 10 and for
disk-bound jobs of 30.

Figure 3-10. Summary of Performance Plots

Figure 3-13. Number of Terminals vs. Average Disk Wait

Figures 3-16 and 3-17 illustrate the results of increasing the
average CPU service time by a factor of 12. Looking only at an
operating region represented by 8 to 20 terminals, the average response
time is not affected greatly, either for a CPU/disk iteration count of
(4,12) or one of (8,24).

The wide range of parameter values for the simulator that would be
considered realistic makes it difficult to draw final and definitive
conclusions from the experiments that have been conducted. Having
examined a set of cases with a limited set of parameter values, it can
be said that response time increases with the number of terminals and
the number of CPU/disk iterations, and decreases as the percentage of
CPU-bound jobs increases. As for how these scale factors are actually
interrelated mathematically, the curves show that the relationships
depend on the operating region, i.e., whether the system is CPU-bound or
disk-bound and whether there is a small (perhaps 20 or fewer) or large
(more than 50) number of terminals.

Further  work  to  make  the  simulator  more  sensitive  to  the  particular 
requirements  of  IDHS  and  to  run  experiments  with  additional  sets  of 
parameter  values  would  permit  more  definitive  analyses  of  the  scale 
factor  interrelationships.  Such  results  would  also  enable  the  scale 
factor-simulation  parameter  equations,  as  described  in  the  earlier 
report,  "Simulator  Variable-Scale  Factor  Equations"  (Appendix  E),  to  be 
completely derived.

Figure 3-16. Number of Terminals vs. Average Response Time

Figure 3-17.

3.2.2 Guidelines on Which Parameters to Scale


The objective of the research into the scaling of systems before
full implementation is attempted is to improve the way design,
development, and evaluation of IDHS are performed. The ultimate IDHS is
derived from the scaled system in a manner that decreases the final cost
and increases the final benefit over that achievable without the use of a
scaled system. Consider Figure 3-18, which illustrates the relationship
among IDHS, development of IDHS, and scaled systems. Characteristics of
intelligence data handling computer systems, when considered in light of
what is known today about computer system development, dictate a certain
cost/benefit achievable with a given development effort. Suppose that a
scaled system is defined, based on the ultimate intelligence data
handling system objectives, but without some of the characteristics that
contribute to increased cost and reduced benefit. The scaled system
could then be implemented at a fraction of the cost of the complete
system, and could furthermore be used to change some of the undesirable
characteristics of the ultimate system. For example, one factor
increasing system cost is lack of personnel experience with the system.
After developing a scaled system, project personnel will have the
experience necessary to develop the complete system at reduced cost.
Thus, the ultimate intelligence data handling computer system is derived
from the scaled system in a manner that decreases the final cost and
increases the final benefit over that achievable without the use of a
scaled system.

The unique problems entailed in implementing an IDHS computer system
are based on a combination of its characteristics. Since understanding
how these characteristics might be modified within a scaled system is
necessary to using scaled system techniques, major characteristics of
IDHS will be discussed.

a.  System  uniqueness 

Most intelligence data handling computer systems are unique.
Although most do share common functions, such as communications, each
system developed must support specific mission requirements and interface
with specific other in-place systems. Cost savings have been realized
through transfer of technology, such as implementing National Military
Intelligence Center (NMIC) Support Software (NSS) for the Preliminary
Operational Capability (POC) of the Pacific Command (PACOM) Data Services
Center (PDSC), but uniqueness is not removed through this process. Thus,
several man-years of development were still required for the PDSC POC due
to unique hardware interfaces and different computer configurations. In
addition, implementation of many new and unique capabilities for PDSC is
currently underway.

b.  Security 

All intelligence data handling computer systems operate within
secure environments due to the classified information they process. Many
of these systems are subject to the especially stringent security
constraints required for processing sensitive compartmented information
(SCI). Types of security required include physical (access restriction),
personnel (clearances required for access), TEMPEST (electronic
emanation), and COMSEC (communications security between systems). In
addition, computer hardware/software security provides another line of
defense against unauthorized access by preventing information retrieval
without knowledge of correct passwords and codewords even if physical
security is breached. Techniques of hardware/software security are
expected to improve considerably in the near future, as extremely
reliable measures are required to process data of differing
classifications within the same system. Current requirements for such
multi-level secure processing have spurred research efforts such as
Kernelized Secure Operating System (KSOS). Parameters related to
security objectives such as file protection methods, granularity of data
access control, encryption, and authentication mechanisms are potential
elements for scaling.

c.  Interactive 

Most intelligence data handling computer systems are
interactive; that is, they interface with users at on-line terminals. In
order to be effective, these systems must provide rapid response to user
requests. Many of these requests may require complex processing, and a
large number of user terminals is often supported. Thus, the NMIC system
may be accessed from over thirty terminals, and may process requests to
search an entire five-day message file for specific items. The number of
terminals and the required responsiveness can be objects of scaling
procedures.

d.  Real-Time 

In addition to being interactive, most intelligence data
handling computer systems also include components that must operate in
real-time. This is particularly true for components handling direct
sensor input or, as is more common, components handling communication
circuitry and protocols. Thus, the Intelligence Data Handling System -
Communications (IDHSC II) is capable of controlling several communication
channels with bandwidths of 9600 baud. Messages must be processed as they
are received, and must be transmitted with timeliness. IDHSC II
additionally performs sophisticated packet switching and other message
handling functions, and it is conceivable that bandwidths of up to 50KB
may eventually be required. Real-time operations can be scaled by
simulating real-time data with input data, and transmission and
dissemination functions can be eliminated for scaling purposes.

e. State-of-the-art

Most intelligence data handling computer systems include at
least some components that are state-of-the-art. Some systems are based
entirely upon research into state-of-the-art techniques. For example,
the Advanced Indications System (AIS) includes aspects relating to
artificial intelligence and the emerging technology of decision support
systems. It would probably not be desirable to scale state-of-the-art
features.

f.  Large  data  base 

Increasing sophistication in intelligence collection techniques
and expanding computer storage and processing capabilities have provided
impetus toward development of intelligence data handling computer systems
with data bases of ever-increasing size and complexity. The Advanced
Imagery Requirements Exploitation System (AIRES) data base currently
consists of several billion characters of on-line information. Data base
size and the number of intra-data base linkages are prime candidates for
scaling.

g. Interoperability

The vast amount of intelligence data collected and the
decentralization of mission responsibility, particularly as is being
implemented under the Delegated Production Policy, dictate that many
different intelligence data handling computer systems exist in diverse
geographic locations. However, the necessity for fusion of intelligence
from different sources requires that communication and interoperability
be established among these various intelligence data handling computer
systems. Interoperability requirements are particularly wide-ranging for
national-level systems such as the Defense Intelligence Agency (DIA)
Integrated Indications Systems (DIIS) currently being designed, which
will interface with at least a dozen other systems. Different locations
can be scaled by simulating them through input data.

h.  Reliability/Availability 

Many intelligence data handling computer systems operate on an
around-the-clock schedule, and all are expected to be available during
virtually 100% of their scheduled up-time. With many of these systems
extremely critical for the national defense of the United States, serious
degradations of reliability and/or availability cannot be tolerated.
Also, the extensive amount of interoperability implemented causes
systems to depend on each other and may cause one malfunctioning system
to adversely impact others.

i. Changing Requirements/Evolving Systems

Intelligence collector technology growth, coupled with the long
lead times required to implement data handling computer systems, often
causes system requirements to change several times during a development
effort. Furthermore, the overall national intelligence data handling
capability is continually evolving, causing each individual intelligence
data handling computer system to similarly evolve. The Community On-Line
Intelligence System (COINS) provides a good example of rational evolution
and requirements changes affecting several individual automated systems.
COINS was initially implemented as a dedicated network directly
interconnecting various large-scale host mainframes. As additional
hosts were added to the network, it became apparent that host
programming changes were becoming prohibitively expensive, so a front-end
processor architecture was implemented. The architecture also included
communication processors similar to the Interface Message Processors
(IMPs) used on the Advanced Research Projects Agency Network (ARPANET).
Some network sites did not have IMPs but did have IDHSC II processors,
however, so COINS protocols were implemented through IDHSC II and
interfaced to other members of the original COINS. Efforts currently
underway with respect to COINS include an experiment to eliminate
dedicated circuits by sending traffic, suitably encrypted, to distant
sites through the actual ARPANET.

The system parameters which apply to each IDHS characteristic are
described more fully in "Software Scale Parameters" (Appendix B) and
"System Scale Factor Metrics" (Appendix C). Which of these
characteristics to scale depends on the major objectives of the
development effort, making it hard to quantify application-dependent
parameters. In addition, quantifying software system attributes is a
young, expanding discipline in which definitions and emphases tend to
shift, contributing to the dynamic nature of the terminology and
technical base. Actual metrics and relationships may therefore be
reshaped as research continues toward the goal of achieving an
understandable and workable methodology for scaled systems development.

The  experiments  performed  with  the  simulator  indicate  that 
guidelines  for  which  parameters  to  scale  will  have  to  be  narrowly 
defined,  depending  on  such  factors  as  operating  region  and  expected 
system  size,  e.g.,  number  of  terminals.  Although  no  drastic  changes  in 
scale  factor  interrelationships  occur  in  these  different  environments, 
there  is  a  significant  amount  of  variance,  e.g.,  doubling  the  number  of 
terminals  does  not  always  double  the  response  time. 

Figure 3-19 illustrates the considerations in deciding what to
scale. It is necessary and advantageous to first prepare the lists of
objectives for using both the full-scale system and the scaled system.
Full-scale system objectives fall into two categories: the general type
of system, such as real-time versus batch, and any unique objectives
required for the system, such as 100% up-time, flexibility to interface
with other evolving systems, simple transportability, etc. For example,
a real-time data acquisition system could be scaled on the input data
rates or number of input lines, while an interactive system could be
scaled on the number of users. The objectives of the full-scale system
are related to the functions it performs, which will aid in determining
which parameters to scale. The objectives for using scaled systems
within a development effort will also be factors in determining which
parameters to scale, as scaling certain parameters may clearly aid or
hinder accomplishment of these objectives. As has been discussed in
previous sections, the benefits include obtaining user feedback for final
design decisions, testing unique or state-of-the-art concepts, providing
project experience for the development team, implementing an initial
operational capability which will be later enhanced, and predicting
full-scale system cost, schedule, risk, and performance. A scaled system
built to establish the feasibility of a unique system should not have
complex or state-of-the-art features scaled, while a scaled system built
to elicit final user design feedback should have the number of users, but
not the user interface, scaled. The use of a scaled system must also be
cost-effective, and "too much" scaling must be avoided or it will be
impossible to extrapolate performance results. Figure 3-20 presents a
summary of guidelines for selecting which parameters to scale for sample
objectives.

The value of using the simulator to aid in determining guidelines as
to which parameters to scale is that, for a given set of objectives for
the full-scale system, tests can be run with various scenarios
representing different sets of parameters scaled and the implications of
such scaling can be easily and inexpensively evaluated. The limits
beyond which some scale factors should not be scaled can also be
determined in this way. For example, it might be seen that given the set
of parameter values that define the proposed system, halving the number
of terminals from 100 to 50 doubles the interactive responsiveness.
However, halving the number of terminals from 50 to 25 triples the
interactive responsiveness. In this case, the simulator would indicate
that the number of terminals should not be scaled to less than half
without taking the change in the terminal-responsiveness relationship
into account.
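
The halving example can be turned into a simple check on simulator output.
The sketch below is illustrative only: the function names, the 10%
tolerance, and the responsiveness values (chosen to reproduce the
"doubles, then triples" example in arbitrary units) are assumptions.

```python
def step_gains(responsiveness):
    """Responsiveness gain for each successive reduction in terminals.

    responsiveness: list of (terminals, responsiveness) pairs ordered
    from the full-scale system down to the smallest scaled system.
    Returns a list of (terminals_after_step, gain) pairs.
    """
    return [(small_n, small_r / big_r)
            for (big_n, big_r), (small_n, small_r)
            in zip(responsiveness, responsiveness[1:])]


def scaling_limit(responsiveness, tolerance=0.1):
    """Smallest terminal count reached before the gain per step departs
    from the first step's gain by more than `tolerance` (relative)."""
    gains = step_gains(responsiveness)
    limit = responsiveness[0][0]
    reference = gains[0][1]
    for terminals, gain in gains:
        if abs(gain - reference) / reference > tolerance:
            break
        limit = terminals
    return limit


if __name__ == "__main__":
    runs = [(100, 1.0), (50, 2.0), (25, 6.0)]    # hypothetical simulator runs
    print(step_gains(runs))
    print("do not scale below", scaling_limit(runs), "terminals")
```

For these values the gain changes from 2 to 3 at the second halving, so
the check stops at 50 terminals, matching the conclusion drawn in the
text.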
SAMPLE OBJECTIVES            PARAMETERS TO SCALE

System Uniqueness            ?

Security                     File protection methods, granularity
                             of data access control, encryption,
                             authentication

Interactive                  Number of terminals, required
                             responsiveness

Real-Time                    Input data rates, number of input
                             lines, transmission and dissemination
                             functions

State-Of-The-Art             ?

Large Data Base              Data base size, number of intra-data
                             base linkages

Interoperability             Transmission and dissemination functions

Reliability/Availability     ?

Figure 3-20. Summary of Scaling Guidelines

General guidelines can be derived, however, from this research.
The experiments have demonstrated that responsiveness decreases as the
number of terminals and the number of CPU/disk iterations increase. It
can also be seen that, as shown in Figures 3-16 and 3-17, increasing the
average CPU service time does not greatly increase the response time in
the region examined (8 to 20 terminals), indicating that functions
requiring extra CPU time, e.g., security overhead, can be moderately
scaled without affecting other scale factors. The cost model can then be
used to determine the degree of scaling that is both advantageous and
feasible.

3.3  Decision  Factors  and  Guidelines 

The  purpose  of  this  section  is  to  establish  the  basic  and 
generalized  guidelines  which  system  architects  can  refer  to  in 
determining  the  feasibility  and  cost-effectiveness  of  building  a  proposed 
system to scale. From that point, a more detailed discussion supported
by  quantitative  exhibits  will  be  presented. 

3.3.1  Overview 

Decision guidelines for the potential scaling of proposed system designs
will most often focus on two major questions:

(1)  Can  the  system  in  question  be  built  to  scale? 

(2)  Will the resultant scaled system prove worthwhile
both in its possible operational value and in
benefits realized for application to the unscaling
effort?

3.3.1.1 Scaling Feasibility

Before answering the first question, a thorough analysis of the
proposed system's characteristics must be performed in order to establish
a reasonable scaling methodology. Information of value would generally
consist of such items as the system's requirements, performance criteria,
functionality, proposed architecture, estimated program and data base
size, and configuration, both in terms of its hardware and software. In
addition, the level of technical staff experience and qualifications, the
development schedule, resource allocation (staff-loading), and the
ultimate target delivery date for the full-scale system must be taken
into account. Once such data is gathered, the formulation of a scaled
system development methodology may commence. Coupled with the "how" of
scaling, however, is the "why" of scaling, which raises the importance of
the second question stated.

It should be stated that the importance of a scaled system lies not
in the fact that the scaling can actually be accomplished, but in the
benefits that actually accrue to the ultimate full-scale system. It is
important, then, that the objectives of the scaled system be established
early on. It is additionally important to maintain the distinction
between the concepts of scaling and prototyping. While a scaled system
is most certainly a prototype, a prototype may not necessarily have the
properties of a scaled system. While the potential value of prototype
systems is acknowledged, the discussion of such is considered beyond the
scope of this report.

In examining system attributes in terms of scaling feasibility, the
ones with the least risk should be considered first. The motivation here
is to scale attributes where there is much certainty about their
full-scale properties so that relatively higher-risk system components
may be implemented and thoroughly scrutinized in the scaled system.

Important attributes to keep in mind during the design of a scaled
system include its degree of modularity and transportability. In most
cases, the unscaling effort would certainly benefit from any inventory of
source code accumulated during the scaled effort. This, of course,
requires extra effort in planning since little is known about the
actual workings of the full-scale system at the front-end of the
development cycle and, given the development methodology selected
(scaling), it is most certainly a complex and technically challenging
system defying any such planning attempts. Nevertheless, attention to
designing modular, transportable code for the scaled system will
eliminate the need to produce similar code for the ultimate full-scale
system and will result in cost savings for the full-scale system as well
as a reduction in total project costs.

3.3.1.2 Cost Modeling and Parametric Analysis

In the planning and design stages for a scaled system, the planners
inevitably find themselves deep in the realm of cost estimation modeling
and parametric analysis in determining the potential cost
effectiveness of the development methodology chosen. Such tools are
important in exploring the interrelationships that exist between
the scaled system and its full-scale counterpart in determining total
cost, schedule, and risk. While hardware cost estimation can be achieved
with an acceptable degree of accuracy, software cost estimation involves
many critical variables which complicate the formulation of accurate cost
projections. One such variable is time; major software development
efforts nearly always span a considerable amount of time. Software cost
estimates therefore bear a significant degree of uncertainty because they
address future events heavily dependent upon the interaction of a group
of people. Consequently, a small deviation in the resulting delivery
schedule has a major impact upon costs because, in software
development, the burdened costs of maintaining a project staff, generally
at significant pay-scales, are large. Future projections thus bear a
degree of uncertainty proportional to the term under consideration;
long-term predictions are long on risk while shorter-term predictions
involve relatively less risk. System hardware cost estimation stands in
contrast to software cost estimation because, in the
procurement of hardware, the objects are generally "off-the-shelf" items
where the major concerns deal mostly with the transportation,
interfacing, and check-out of the various modular hardware components, and
a relatively shorter time frame is involved.

Due to the difficulty involved in dealing with critical variables,
such as time, in the planning of systems, parametric analysis has become
a useful tool in the making of projections. Parametric analysis can be
loosely defined, for the purposes of this discussion, as the posing of
"what if" questions; the power of the technique lies in its assessment of
the sensitivities of the various crucial variables present in our
estimate calculations, such as time.
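
A "what if" question of this kind can be posed mechanically. The sketch below is a minimal illustration, assuming a deliberately simple labor cost model (staff size times a burdened monthly rate times schedule months); the model, the function names, and every number in it are hypothetical and are not taken from the cost model used in this research.

```python
def labor_cost(staff, monthly_burdened_rate, months):
    """Burdened labor cost of keeping a project staff for a schedule."""
    return staff * monthly_burdened_rate * months


def what_if(base, variations):
    """Return the cost change caused by each named parameter variation."""
    base_cost = labor_cost(**base)
    deltas = {}
    for name, change in variations.items():
        scenario = {**base, **change}  # override only the varied parameter
        deltas[name] = labor_cost(**scenario) - base_cost
    return deltas


# Hypothetical baseline: 20 people for 24 months.
base = {"staff": 20, "monthly_burdened_rate": 6000.0, "months": 24}
deltas = what_if(base, {
    "3-month slip": {"months": 27},
    "2 extra staff": {"staff": 22},
})
for name, delta in deltas.items():
    print(name, delta)
```

Under these assumed figures a three-month schedule slip costs more than adding two staff for the whole schedule, which is the sort of sensitivity to time that the text describes.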

3.3.1.3 Scaled Systems Decision Criteria

Thus, while the feasibility and methodology of producing a system to
scale are the primary responsibility of the system's architects, the
resources of a parametric analyst and a cost estimation method are
crucial in determining, at the outset, any potential cost savings that
could occur through the adoption of a scaled system development
methodology. Cost savings are perceived as the principal driver of the
scaled system design; however, it should be emphasized that situations
may arise where potential cost savings are subordinate to full-scale
product quality considerations such as reliability, efficiency,
integrity, and performance. Nevertheless, the objective of the following
sections is to provide the system planner with appropriate guidelines by
which he may mentally determine the feasibility of applying the scaled
systems approach to a particular software effort.

3.3.2  Decision  Factors  Influencing  System  Development 

There are a number of research papers appearing in the open
literature itemizing factors which influence software development cost
and schedule. Some authors have additionally been able to quantify the
effects of the presence or absence, to varying degrees, of these factors.

One of the first to do so was J.D. Aron in "Estimating Resources for
Large Programming Systems" [ref. 1]. A result of this study is
illustrated in Figure 3-21. In this illustration, we find Aron's
productivity table, which relates code production to factors such as
difficulty, schedule duration, and interface complexity. Of note to
planners of scaled systems are the facts that, generally, the longer the
development schedule duration and the less the interface complexity and
difficulty, the greater the productivity and, hence, the lower the cost.
Of especial interest is the counter-intuitive nature of productivity
presented in terms of development schedule duration; the longer the
schedule, the greater the productivity. This anomaly has been noted by
other authors such as Brooks and Putnam, and the phenomenon is perhaps
best explained by Putnam [ref. 17]. Yet even from this simple table,


                                      Duration
                     6-12             12-24            More Than
Difficulty           Months           Months           24 Months

Easy                 20               500              10,000
(Very Few                             (25/day)         (40/day)
Interactions)

Medium               10               250              5,000
(Some                                 (12.5/day)       (20/day)
Interactions)

Difficult            5                125              1,500
(Many                                 (6.25/day)       (6/day)
Interactions)

Units                Instructions     Instructions     Instructions
                     per Man-Day      per Man-Month    per Man-Year

Figure 3-21. Aron's Productivity Table


planners of scaled systems can be confident that through the limiting of
a project's size, scope, and complexity - some of the attributes of a
scaled system - productivity performance can indeed be increased and
total resources and labor put to more efficient use.

Next to itemize software developmental factors was Doty (and
associates) under a research effort for the Rome Air Development Center
(RADC). In Software Cost Estimating Study - Guidelines for Improved
Software Cost Estimating [ref. 7], the authors identified forty-six
factors which contribute significant impacts upon software project costs
and schedules. These forty-six factors were divided into three
homogeneous groups and are listed in Figure 3-22. In addition to this
enumeration, Doty and his associates were able to formulate a set of
effort (cost) formulae characterized by separations based upon
application type and respective adjustment factors specifically
accounting for some of the environmental attributes. These formulae were
arrived at based upon the data RADC had internalized concerning over four
hundred software development efforts. The Doty cost formulae and
adjustment factors appear in Figure 3-23. Individual environmental
factors quantified in Figure 3-23 are identified in Figure 3-22 by an
asterisk ("*") alongside the corresponding factor. It is readily
apparent that not all of the effects of the factors listed in Figure 3-22
were quantified. Presumably this is due to the inherent difficulty and
probable research constraints limiting the quantitative determination of
such effects.

Of wide interest to researchers of software engineering in general,
and cost and productivity modeling in particular, is an article entitled

Figure 3-23. RADC/Doty Factors Affecting Development

"A Method of Programming Measurement and Estimation," by Felix and
Walston of IBM [ref. 24]. This article first appeared in the IBM Systems
Journal, Volume 16, Number 1, in 1977. In this article, the authors
examined a group of sixty completed software development projects that
covered a wide range of application type, size, and complexity. From
this research, the authors compiled a list of productivity rates itemized
by environmental or product factor. This list is summarized in Figure
3-24. Regrettably, the data, as presented, is not of much use to system
planners. The authors did allude to a methodology whereby the
productivity rates could be incorporated into an estimation model, but
unfortunately they did not elaborate upon the details necessary to put
the methodology into practice. Hence, under this research effort an
attempt was made to incorporate this raw data into a general scheme of
guidelines through which system planners might be able to assess the
potential benefits of applying the scaled systems development
methodology.

3.3.2.1 Factor Quantification

The Walston and Felix article is one of the few available sources of
quantitative empirical data concerning the effects of the many
environmental factors influencing software development. In order to
obtain meaningful decision factors from the Walston and Felix data
identified in the previous section, it was first necessary to somehow
translate the raw productivity rates into some sort of predictive
coefficients indicating the respective impacts of the development factors
on a project's cost, effort, or schedule. While it was recognized that
the resultant factors may not apply to any particular environment, the

Figure 3-24. IBM Factors Affecting


intent was rather to formulate a set of surrogate values based upon
actual "real-world" experience for the purpose of rough effort
approximation and scaled system trade-off analysis. Again, the purpose
of such an exercise would be to provide the system planner with a tool to
facilitate his assessment of the feasibility of adopting the scaled
system developmental approach.

After reasoning that the Doty coefficients must have been based
upon productivity data similar to that which Walston and Felix provide
(except obtained from a different source - RADC), it was determined that
the Doty method would be an attractive model upon which to base the
determination of the coefficients from the Walston and Felix data. In
addition, there would be benefits to representing both sets of data in
the same manner, as they would complement each other. To review, the Doty
method of accounting for environmental factors consisted of coefficients
that, when multiplied together, produced a multiplicative factor that
could be used in an equation of the form:

Person-months of Effort = Constant * SLOC ^ Exponent * Π
where: "Π" is the multiplicative factor
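
The form of this equation can be illustrated with a short Python sketch. The function name, the constant, the exponent, and the coefficient values below are placeholders chosen for illustration only; they are not the published Doty values, which appear in Figure 3-23.

```python
from math import prod


def effort_person_months(sloc, constant, exponent, factor_coefficients):
    """Effort = Constant * SLOC ^ Exponent * (product of coefficients)."""
    # The multiplicative factor is the product of the individual
    # environmental coefficients; with no coefficients it is 1.
    multiplier = prod(factor_coefficients)
    return constant * sloc ** exponent * multiplier


# Placeholder inputs, not Doty's published values.
nominal = effort_person_months(10, 5.0, 1.0, [])
adjusted = effort_person_months(10, 5.0, 1.0, [1.2, 0.9])
print(nominal, adjusted)
```

Because the coefficients enter as a product, a coefficient above 1 raises the estimate and one below 1 lowers it, and each acts on the estimate already adjusted by the others.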

Of particular value in the Doty method is the fact that each
environmental factor value not only relates its marginal impact upon a
project's estimated cost, but is also expressed in a form such that its
implied interrelationship with the other factors is automatically
accounted for. Other organizations and researchers have used these same
environmental factors and their corresponding coefficients in their
original form for other, different estimating purposes with acceptable
degrees of success. A prime example would be the Space and Missile Systems
Organization's (SAMSO) Software Programs Office (SPO) in Los Angeles,
California, where the Doty factors and coefficients are used to adjust
the technology constant in Putnam's software equation [ref. 13]. The
Putnam equation relates system size to total project effort and schedule
through the technology constant and is totally different from the Doty
methodology.

The problem, then, was to quantify the IBM data in a
manner similar to the Doty methodology. Traditional systems thinking
techniques were applied to the problem first - problem solution through
problem decomposition. Step one consisted of combining related
development factors into groups. The resulting groupings are shown in
Figure 3-25. Of concern to this research was the fact that the
aggregation of the effects of the individual development factors tended
to over-emphasize the resulting productivity estimates. It was
subsequently determined that the original data did not result from "pure"
laboratory research conditions and that the mere presence of some
environmental factors implied other, related factors. For example, the
IBM data might be interpreted to suggest that the presence of both
structured programming techniques (303 Lines of Code per Month - LOC/M)
and top-down design (319 LOC/M) would result in productivity of 622
source lines per month, which, from the other data present, seems
questionable. Top-down development and structured programming
techniques, in all probability, occurred simultaneously in the Walston
and Felix project data base; hence, additive-type analytical techniques
applied to the published data would tend to over-emphasize the effects of
the various development factors. This simplistic example illustrates the
problem of attempting to aggregate the resultant effects, in terms of

Figure 3-25. IBM/Walston and Felix Environmental and Product Factor Groupings

productivity, of the various factors in determining guidelines based upon
such data. The real problem with the data, as we perceive it, is that it
does not result from purely controlled situations. Of course, it is not
expected to, as it is recognized that the gathering of such data under
pure laboratory conditions is far too expensive and time consuming, even
if it were possible. The task to be performed then was perceived as
inferring, through some quantitative basis, the effects of the combined
environmental factors. This was first applied through quantifying the
aggregated effects of all the factors in each particular group of related
factors. To accomplish this, the extreme low- and high-end productivity
rates for each component of a group were totaled. A marginal group
productivity impact was then calculated based upon these totals through
the following equation:

            High total - Low total        Marginal Aggregate
[3.2-a]     ----------------------   =    Productivity
                  Low total               Impact for Group
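
As a check on equation 3.2-a, the following Python sketch applies it to the low- and high-end rates that this section reports for the "Structured Techniques" group. The function name is an illustrative choice; the rates are the ones tabulated in Figure 3-27.

```python
def marginal_group_impact(low_rates, high_rates):
    """Equation 3.2-a: (high total - low total) / low total."""
    low_total = sum(low_rates)
    high_total = sum(high_rates)
    return (high_total - low_total) / low_total


# Low- and high-end productivity rates (LSC/PM) for the four
# "Structured Techniques" components, as tabulated in Figure 3-27.
low = [169, 220, 196, 219]
high = [301, 339, 321, 408]

impact = marginal_group_impact(low, high)
print(round(impact, 2))      # 0.7, i.e. a 70% increase
print(round(1 + impact, 2))  # 1.7, the factor listed in Figure 3-26
```

The second printed value shows how the percentage from equation 3.2-a relates to the multiplicative factors listed for each group: the factor is one plus the fractional increase.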

The marginal productivity impacts of each group, and the data used in
arriving at them, are illustrated in Figure 3-26. From this illustration
we see that the components of the group "Structured Techniques"
contribute positively to productivity by a factor of 1.7. Whether
this translates to a 70% increase in productivity for all organizations
and environments cannot be determined; however, in the IBM development
arena, for a set of projects similar to those which constitute the IBM
data, it would be reasonable to expect that these practices would
contribute to a

Group                                                    Productivity Impact

Structured Techniques                                            1.70
    Structured Programming
    Design and Code Inspections
    Top-down Development
    Chief Programmer Teams

Complexity                                                       1.69
    Overall Code Complexity
    Complexity of Application
    Complexity of Program Control Flow

Code Mix                                                         1.47
    Proportion of Code classed as Non-Mathematical and I/O Formatting
    Proportion R/T, Interactive, or Time-Critical Code
    Proportion of Code Intended for Delivery

Utilization                                                      1.86
    Overall Program Design Constraints
    Core Memory Design Constraints
    Execution Time Design Constraints

Platform                                                         1.92
    Customer Interface Complexity
    Degree of User Participation in Req'mts. Def.
    Degree of Customer-Originated Design Changes
    Degree of Customer Experience in Application Area

Resources                                                        2.73
    Ave. Personnel Experience and Qualifications
    % Dev'mt. Programmers who Participated in Func. Design Spec.
    % Utilization of Currently-available Hardware
    Degree of Previous Experience with Oper. Computer
    Degree of Previous Experience with Programming Languages
    Degree of Previous Experience with Appl. Size and Complexity

Security                                                         1.72
    Spcl. Req. for Access to Dev'mt. CPU
    Amount of Open Access Time to Dev'mt. CPU
    Classified Security Environment for CPU and 25% of Programs & Data

Misc. Items                                              * Not Calculated *
    Proportion of Data Base Class-Items to 1,000 LOC
    Proportion of Doc. Pages to 1,000 LOC
    Ratio of Staff Size to Project Duration (People/Month)

Figure 3-26. IBM/Walston and Felix Marginal Group Productivity Impacts


Group                                                    Productivity Impact

Resources                                                        2.73
    Ave. Personnel Experience and Qualifications
    % Dev'mt. Programmers who Participated in Func. Design Spec.
    % Utilization of Currently-available Hardware
    Degree of Previous Experience with Oper. Computer
    Degree of Previous Experience with Programming Languages
    Degree of Previous Experience with Appl. Size and Complexity

Platform                                                         1.92
    Customer Interface Complexity
    Degree of User Participation in Req'mts. Def.
    Degree of Customer-Originated Design Changes
    Degree of Customer Experience in Application Area

Utilization                                                      1.86
    Overall Program Design Constraints
    Core Memory Design Constraints
    Execution Time Design Constraints

Security                                                         1.72
    Spcl. Req. for Access to Dev'mt. CPU
    Amount of Open Access Time to Dev'mt. CPU
    Classified Security Environment for CPU and 25% of Programs & Data

Structured Techniques                                            1.70
    Structured Programming
    Design and Code Inspections
    Top-down Development
    Chief Programmer Teams

Complexity                                                       1.69
    Overall Code Complexity
    Complexity of Application
    Complexity of Program Control Flow

Code Mix                                                         1.47
    Proportion of Code classed as Non-Mathematical and I/O Formatting
    Proportion R/T, Interactive, or Time-Critical Code
    Proportion of Code Intended for Delivery

Misc. Items                                              * Not Calculated *
    Proportion of Data Base Class-Items to 1,000 LOC
    Proportion of Doc. Pages to 1,000 LOC
    Ratio of Staff Size to Project Duration (People/Month)

Figure 3-26a. IBM/Walston and Felix Marginal Group Productivity Impacts
- Listed in Order of Precedence

productivity increase on the order of 70%. Of subsequent interest is the
individual contribution from each component comprising the group. To
arrive at these individual contribution factors, an equation of the
following form had to be solved:

Marginal Group Productivity Impact = (1+C1) x (1+C2) x .... x (1+Cn)    [3.2-b]

where: C1 through Cn represent the marginal contributions
       of each group's component members

Obviously, this is no trivial task, and it appears that the
component values could take on any of a wide range of possible
values. Fortunately, the solution can be determined due to the implied
variable relationships that exist in the basic productivity data.
Through the data, the basic equation of the form 3.2-b could be
translated to a form described by only one of the variables, where the
remaining variables are defined through the one variable and a ratio
calculated from the original data. This translation, coupled with the
facility of a digital computer, greatly simplifies the solution
procedure. This basic solution procedure is illustrated mathematically
in Figure 3-27. In this example, it is shown how the component marginal
contribution rates of the components of the "Structured Techniques" group
were determined. The equation of the single variable, A, was solved
on an interactive computer system through a program utilizing an
iterative solution technique. With this same procedure, the remaining
contribution factors of the groups could be solved for, and the results
are provided in Figure 3-28. Figure 3-29 summarizes all of the group
productivity impacts as well as the component contributions of each

                                        Productivity (LSC/PM)
Structured Techniques                   No      Yes     % Increase

A) Structured Programming               169     301     78
B) Design and Code Inspections          220     339     54
C) Top-down Development                 196     321     64
D) Chief Programmer Teams               219     408     86
   Totals -                             804     1369

These four factors affect productivity by (1369 - 804) / 804 = +70%

====> 70% is this Group's Marginal Productivity Impact.

Relationships:

(1+A) x (1+B) x (1+C) x (1+D) = 1.7

A/B = 78/54      B = 54*A/78      A /=/ 0.16
A/C = 78/64      C = 64*A/78      B /=/ 0.11
A/D = 78/86      D = 86*A/78      C /=/ 0.13
                                  D /=/ 0.17

Solutions found by:

(1+A) x (1+(54*A/78)) x (1+(64*A/78)) x (1+(86*A/78)) /=/ 1.70

Notes: "LSC/PM" means "Lines of Source Code per Person Month"
       " x " symbolizes arithmetic multiplication
       " /=/ " means - "approximately equal to"

Figure 3-27. Sample Calculation of Group Component Contributions
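
The single-variable solution procedure of Figure 3-27 can be reproduced with a short program. The report states only that an iterative technique was used; the bisection search below is an assumed stand-in for it, and the function name and tolerance are illustrative choices.

```python
def solve_contributions(percent_increases, group_impact, tol=1e-9):
    """Solve (1+C1) x (1+C2) x ... x (1+Cn) = group_impact, with each Ci
    tied to the first component A by the ratio of percent increases."""
    reference = percent_increases[0]
    ratios = [p / reference for p in percent_increases]  # Ci = ratio * A

    def product(a):
        result = 1.0
        for ratio in ratios:
            result *= 1.0 + ratio * a
        return result

    # product(0) = 1 and product(group_impact) > group_impact, so the
    # root is bracketed; bisection is an assumed choice of technique.
    lo, hi = 0.0, group_impact
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if product(mid) < group_impact:
            lo = mid
        else:
            hi = mid
    a = (lo + hi) / 2
    return [ratio * a for ratio in ratios]


# "Structured Techniques" group: percent increases 78, 54, 64, 86.
contributions = solve_contributions([78, 54, 64, 86], 1.70)
print([round(c, 2) for c in contributions])  # [0.16, 0.11, 0.13, 0.17]
```

Rounded to two places, the result agrees with the values of A, B, C, and D shown in the figure. The same function applies to the other groups by substituting their percent increases and group impact.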

                                        Productivity (LSC/PM)
Structured Techniques                   No      Yes     % Increase

A) Structured Programming               169     301     78
B) Design and Code Inspections          220     339     54
C) Top-down Development                 196     321     64
D) Chief Programmer Teams               219     408     86
   Totals -                             804     1369

These four factors affect productivity by (1369 - 804) / 804 = +70%

====> 70% is this Group's Marginal Productivity Impact.

Relationships:

(1+A) x (1+B) x (1+C) x (1+D) = 1.7

A/B = 78/54      B = 54*A/78      A /=/ 0.16
A/C = 78/64      C = 64*A/78      B /=/ 0.11
A/D = 78/86      D = 86*A/78      C /=/ 0.13
                                  D /=/ 0.17

Solutions found by:

(1+A) x (1+(54*A/78)) x (1+(64*A/78)) x (1+(86*A/78)) /=/ 1.70

Notes: "LSC/PM" means "Lines of Source Code per Person Month"
       " x " symbolizes arithmetic multiplication
       " /=/ " means - "approximately equal to"

Figure 3-28. Calculation of Group Component Contributions

                                        Productivity (LSC/PM)
                                           "Average"
Complexity                              >       <       % Increase

A) Overall Code Complexity              185     314     69
B) Complexity of Application            168     349     108
C) Complexity of Program Control Flow   209     289     38
   Totals -                             562     952

These three factors affect productivity by (952 - 562) / 562 = +69%

====> 69% is this Group's Marginal Productivity Impact.

Relationships:

(1+A) x (1+B) x (1+C) = 1.69

A/B = 69/108     B = 108*A/69     A /=/ 0.19
A/C = 69/38      C = 38*A/69      B /=/ 0.29
                                  C /=/ 0.10

Solutions found by:

(1+A) x (1+(108*A/69)) x (1+(38*A/69)) /=/ 1.69

Notes: "LSC/PM" means "Lines of Source Code per Person Month"
       " x " symbolizes arithmetic multiplication
       " /=/ " means - "approximately equal to"

Figure 3-28 (Cont.). Calculation of Group Component Contributions


                                        Productivity (LSC/PM)
                                          "Relatively:"
Code Mix                                Ltl     Mch     % Increase

A) Non-math; I/O Formatting             188     267     42
B) Non-Real-time, nor time-critical     203     279     37
C) % Intended for Delivery              159     265     67
   Totals -                             550     811

These three factors affect productivity by (811 - 550) / 550 = +47%

====> 47% is this Group's Marginal Productivity Impact.

Relationships:

(1+A) x (1+B) x (1+C) = 1.47

A/B = 42/37      B = 37*A/42      A /=/ 0.12
A/C = 42/67      C = 67*A/42      B /=/ 0.10
                                  C /=/ 0.19

Solutions found by:

(1+A) x (1+(37*A/42)) x (1+(67*A/42)) /=/ 1.47

Notes: "LSC/PM" means "Lines of Source Code per Person Month"
       " x " symbolizes arithmetic multiplication
       " /=/ " means - "approximately equal to"

Figure 3-28 (Cont.). Calculation of Group Component Contributions

                                        Productivity (LSC/PM)
Utilization                             Severe  Minimal  % Increase

A) Program Design Constraints           166     293      77
B) Core Memory Constraints              193     391      103
C) Execution Time Constraints           171     303      77
   Totals -                             530     987

These three factors affect productivity by (987 - 530) / 530 = +86%

====> 86% is this Group's Marginal Productivity Impact.

Relationships:

(1+A) x (1+B) x (1+C) = 1.86

A/B = 77/103     B = 103*A/77     A /=/ 0.21
A/C = 77/77      C = A            B /=/ 0.27
                                  C /=/ 0.21

Solutions found by:

(1+A) x (1+(103*A/77)) x (1+A) /=/ 1.86

Notes: "LSC/PM" means "Lines of Source Code per Person Month"
       " x " symbolizes arithmetic multiplication
       " /=/ " means - "approximately equal to"

Figure 3-28 (Cont.). Calculation of Group Component Contributions

                                        Productivity (LSC/PM)
                                            "Normal"
Platform                                >       <       % Increase

A) Customer Interface Complexity        124     500     303

Degree of User -
B) Participation in Requirements Spec.  205     291     42
C) Originated Design Changes            196     297     52
D) Experience in Application Area       206     318     54
   Totals -                             731     1406

These four factors affect productivity by (1406 - 731) / 731 = +92%

====> 92% is this Group's Marginal Productivity Impact.

Relationships:

(1+A) x (1+B) x (1+C) x (1+D) = 1.92

A/B = 303/42     B = 42*A/303     A /=/ 0.34
A/C = 303/52     C = 52*A/303     B /=/ 0.12
A/D = 303/54     D = 54*A/303     C /=/ 0.13
                                  D /=/ 0.13

Solutions found by:

(1+A) x (1+(42*A/303)) x (1+(52*A/303)) x (1+(54*A/303)) /=/ 1.92

Notes: "LSC/PM" means "Lines of Source Code per Person Month"
       " x " symbolizes arithmetic multiplication
       " /=/ " means - "approximately equal to"

Figure 3-28 (Cont.). Calculation of Group Component Contributions




                                              Productivity (LSC/PM)

Resources                                       Low      High     % Increase

Quality of Currently-available Resources:

A)  Average Personnel Experience                132       410        211
B)  % of Prgmrs who did Design                  153       391        156
C)  % Utilization of Currently-
      available Hardware                        177       297         68

Degree of Previous Experience:

D)  - with the Computer                         146       312        114
E)  - with the Programming Language             122       385        216
F)  - with a Similar Application                146       410        181

    Totals -                                    731      1406

These six factors affect productivity by (1406 - 731) / 731 = +173%

===> 173% is this Group's Marginal Productivity Impact.


Relationships:

    (1+A) x (1+B) x (1+C) x (1+D) x (1+E) x (1+F) = 2.73

                                          A /=/ 0.22
    A/B = 211/156     B = 156*A/211       B /=/ 0.18
    A/C = 211/68      C = 68*A/211        C /=/ 0.12
    A/D = 211/114     D = 114*A/211       D /=/ 0.15
    A/E = 211/216     E = 216*A/211       E /=/ 0.23
    A/F = 211/181     F = 181*A/211       F /=/ 0.20


Solutions found by:

    (1+A) x (1+(156*A/211)) x (1+(68*A/211)) x (1+(114*A/211))
          x (1+(216*A/211)) x (1+(181*A/211)) /=/ 2.73


Notes:

    "LSC/PM" means "Lines of Source Code per Person Month"
    " x " symbolizes arithmetic multiplication
    " /=/ " means "approximately equal to"


Figure 3-28 (Cont.). Calculation of Group Component Contributions
                                              Productivity (LSC/PM)

Security                                        Low      High     % Increase

A)  Access Limited to Computer                  226       357         58

B)  Amount of Open Access to Computer           170       303         78

C)  % of Work which is Classified               156       289         85

    Totals -                                    552       949

These three factors affect productivity by (949 - 552) / 552 = +72%

===> 72% is this Group's Marginal Productivity Impact.


Relationships:

    (1+A) x (1+B) x (1+C) = 1.72

    A/B = 58/78       B = 78*A/58
    A/C = 58/85       C = 85*A/58

    A /=/ 0.18
    B /=/ 0.20
    C /=/ 0.21


Solutions found by:

    (1+A) x (1+(78*A/58)) x (1+(85*A/58)) /=/ 1.72


Notes:

    "LSC/PM" means "Lines of Source Code per Person Month"
    " x " symbolizes arithmetic multiplication
    " /=/ " means "approximately equal to"


Figure 3-28 (Cont.). Calculation of Group Component Contributions
                                                  Marginal and Component
                                                    Contribution Impacts

Resources                                                               2.73
    Ave. Personnel Experience and Qualifications                        1.22
    % Dev'mt. Programmers who Participated in Func. Design Spec.        1.18
    % Utilization of Currently-available Hardware                       1.12
    Degree of Previous Experience with Oper. Computer                   1.15
    Degree of Previous Experience with Programming Languages            1.23
    Degree of Previous Experience with Appl. Size and Complexity        1.20

Platform                                                                1.92
    Customer Interface Complexity                                       1.34
    Degree of User Participation in Req'mts. Def.                       1.12
    Degree of Customer-Originated Design Changes                        1.13
    Degree of Customer Experience in Application Area                   1.13

Utilization                                                             1.86
    Overall Program Design Constraints                                  1.21
    Core Memory Design Constraints                                      1.27
    Execution Time Design Constraints                                   1.21

Security                                                                1.72
    Spcl. Req. for Access to Dev'mt. CPU                                1.18
    Amount of Open Access Time to Dev'mt. CPU                           1.20
    Classified Security Environment for CPU and 25% of Programs & Data  1.21

Structured Techniques                                                   1.70
    Structured Programming                                              1.16
    Design and Code Inspections                                         1.11
    Top-down Development                                                1.13
    Chief Programmer Teams                                              1.17

Complexity                                                              1.69
    Overall Code Complexity                                             1.19
    Complexity of Application                                           1.29
    Complexity of Program Control Flow                                  1.10

Code Mix                                                                1.47
    Proportion of Code classed as Non-Mathematical and I/O Formatting   1.12
    Proportion R/T, Interactive, or Time-Critical Code                  1.10
    Proportion of Code Intended for Delivery                            1.19

Misc. Items                                                * Not Calculated *
    Proportion of Data Base Class-Items to 1,000 LOC
    Proportion of Doc. Pages to 1,000 LOC
    Ratio of Staff Size to Project Duration (People/Month)

Figure 3-29. Summary of IBM/Walston and Felix Group and Component
Contribution Impacts on Software Program Development
environmental factor itemized. With the data provided by Aron, Doty et
al., and Walston and Felix, general guidelines governing the application
of scaled system techniques can be presented.

3.3.3  Generalized  Guidelines  for  Scaling  Systems 

After determining the Walston and Felix marginal group productivity
impacts listed in Figure 3-26 through the procedures already described,
the groups could then be ranked according to their potential impact upon
a software development effort. This has already been done, as may have
been noticed, in Figure 3-26a. Consequently, the summary format of
Figure 3-29 adheres to the same ranking. The significance of this
ranking to the system planner is the relative importance of each
environmental attribute to the construction of cost-effective software
systems. The Doty factors were ranked in a similar manner and are
presented in Figure 3-30. Of particular importance to the potential
practitioner of the scaled system development methodology is the high
priority that personnel experience, use of available hardware, and
establishment of operational and functional requirements hold in
determining an average productivity estimate for the software
development effort.
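
Each group impact in Figure 3-29 is modeled as the product of its component contributions, so the table can be checked by multiplying the components back together. The Python sketch below does this for the figure's values. The dictionary layout, the function names, and the 0.01 tolerance are assumptions chosen for illustration.

```python
import math

# Group impact and component contributions transcribed from Figure 3-29.
FIGURE_3_29 = {
    "Resources": (2.73, [1.22, 1.18, 1.12, 1.15, 1.23, 1.20]),
    "Platform": (1.92, [1.34, 1.12, 1.13, 1.13]),
    "Utilization": (1.86, [1.21, 1.27, 1.21]),
    "Security": (1.72, [1.18, 1.20, 1.21]),
    "Structured Techniques": (1.70, [1.16, 1.11, 1.13, 1.17]),
    "Complexity": (1.69, [1.19, 1.29, 1.10]),
    "Code Mix": (1.47, [1.12, 1.10, 1.19]),
}


def component_product(components):
    """Multiply the (1 + contribution) factors of one group."""
    return math.prod(components)


def discrepancies(table, tolerance=0.01):
    """Return the groups whose component product misses the group impact
    by more than the tolerance, mapped to (group impact, product)."""
    bad = {}
    for name, (group_impact, components) in table.items():
        product = component_product(components)
        if abs(product - group_impact) > tolerance:
            bad[name] = (group_impact, product)
    return bad


for name, (group_impact, components) in FIGURE_3_29.items():
    print(f"{name}: listed {group_impact:.2f}, product {component_product(components):.3f}")
```

An empty result from `discrepancies(FIGURE_3_29)` indicates that every listed group impact agrees with its rounded components to within the chosen tolerance.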

From  the  data  presented,  quantitative  guidelines  such  as  those  that 
follow  may  be  observed: 

3.3.3.1 Personnel Experience

Scaled systems benefit environments characterized by inexperienced
technical staffs and/or technical staffs faced with the challenge of
developing state-of-the-art or otherwise unique systems. From the
Walston and Felix data, the experience an otherwise inexperienced
technical staff gains from a scaled system implementation can be expected
Factor                                                 Effort Increase
                                                          (Maximum)

CPU Time Constraints                                        132%
Concurrent Software and Hardware Development                122%
Development CPU Different from Target CPU                   122%
Detailed Definition of Operational Requirements             100%
First Software Developed on CPU                              92%
Development at More than One Site                            75%
Real-Time Operation                                          67%
Limited Programmer Access to Computer                        50%
Developer Using Computer at Another Facility                 43%
CPU Memory Constraints                                       43%
Special Display                                              43%
Development at Operational Site                              39%
Changing Operational Requirements                             5%

Time-Share, Interactive Development [decreases effort]       21%


Note: Percentage figures came from maximum factors for the particular
environmental attributes listed in Figure 3-23.


Figure 3-30. Doty Factors Ranked in Order of Adverse Impact on Software
Development
to increase productivity in the upscaling effort by approximately 173%.
This translates to nearly three times the average productivity, or
roughly one-third the effort, otherwise expected of an inexperienced
staff. In contrast, the Doty factors reflect that up to a 92% increase
may be achieved based upon the staff gaining familiarity with the
computer equipment alone. Barry Boehm, in Practical Strategies for
Developing Large Scale Software Systems, quantified the resulting
benefits of an experienced staff to be on the order of 150-200%
[ref. 9]. Despite the various sources, all of these data appear to be
relatively consistent. This is reasonable, since the performance of
people should not be expected to change much over time.
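
The conversion from a percentage productivity increase to a productivity multiple and a relative effort is simple arithmetic, sketched below in Python. The function names are illustrative assumptions, and the sketch assumes that effort is inversely proportional to productivity for a fixed amount of code.

```python
def productivity_multiplier(pct_increase):
    """Convert a percentage increase (e.g. 173) to a multiplier (e.g. 2.73)."""
    return 1.0 + pct_increase / 100.0


def relative_effort(pct_increase):
    """Effort after the increase, as a fraction of the original effort,
    assuming effort is inversely proportional to productivity."""
    return 1.0 / productivity_multiplier(pct_increase)


# Percentages quoted in the text: Walston and Felix Resources group (173%)
# and the Doty "first software developed on CPU" factor (92%).
for pct in (173, 92):
    print(f"+{pct}%: x{productivity_multiplier(pct):.2f} productivity, "
          f"{relative_effort(pct):.2f} of the original effort")
```

A 173% increase therefore corresponds to a multiplier of 2.73 and an effort a little over one-third of the original, which is the basis for the "three times" and "one-third" figures above.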

3.3.3.2 Customer Environment - The Platform

Software  system  implementors  faced  with  customer  environments 
characterized  by  such  factors  as  complex  customer  interface  channels  and 
procedures,  customer  uncertainty  and  inexperience  regarding  operational 
and  functional  requirements,  and  the  potentially  numerous 
customer-originated  design  changes  which  subsequently  result  from 
inadequate requirements specification can greatly benefit from the scaled
systems  approach. 

Dr. Daniel Teichroew, a professor of industrial and operations
research at the University of Michigan who is in charge of the ISDOS
(Information Systems Design and Organization System) project and is also
credited with the development of the Problem Statement Language/Problem
Statement Analyzer (PSL/PSA), points out that the front-end stage of an
implementation, where the requirements and high-level design are
specified, is the pitfall of past failures. In his words, "this often
overlooked phase is where most of the problems and potential payoffs lie
(in software development projects)" [ref. 20].

3.3.3.3 Firm Requirements Specifications

An appropriately scaled system can provide difficult customer
environments with the experience and knowledge necessary for the
determination of the precise customer needs.

The importance of establishing the user's needs, in any effort,
cannot be overstated. Two adverse situations may develop in the absence
of a firm specification of the user's needs: (a) the developer implements
a system based on an incomplete or incorrect specification, and the
system is rejected by the users, or (b) the developer continually changes
the system design based on conflicting direction from the user. The
first situation results in a system that fails to meet performance and
functionality requirements, while the second situation greatly increases
system cost and development time, and also runs the risk of entering a
never-ending change cycle in which the system is never actually
completed.

In  most  cases,  failure  to  have  a  firm  user  specification  is  not  the 
fault  of  the  user,  but  is  rather  due  to  the  user  having  incomplete 
information  as  to  exactly  what  is  feasible  and  practical  with  an 
automated  system.  A  scaled  system  built  at  a  fraction  of  full-scale 
system cost can be used to demonstrate exactly what capabilities are
available to the user, as well as to give the user an idea of how he will
interface with the system and what procedures must be developed. Based
upon  his  experience  with  a  scaled  system,  the  user  will  then  be  able  to 
clearly  specify  his  requirements  for  the  full-scale  system. 
The potential benefits resulting from such use of scaled systems are
large. Design changes made as a result of requirements errors have been
documented [ref. 9,10] as being 50 to 400 times more expensive than
front-end design changes (those made before actual implementation
commences). From the Walston and Felix data, the potential benefit in
the upscaling effort is on the order of 92%: nearly twice the
productivity or, conversely, about half the effort. The Doty factor for
changing operational requirements seems to contradict other research, as
it assesses a mere 5% penalty for changing requirements. While this low
value cannot be explained, it also cannot be corroborated with any other
research examined under this effort.

3.3.3.4 Hardware Choice

If the scaled system can provide information that facilitates the
choice of hardware, so that the speed and memory constraints of the
ultimate full-scale system can be more readily met, the Walston and
Felix data suggest a potential 86% increase in productivity for the
upscaling implementation effort.

3.3.3.5 Secure Environments

Planners  of  intelligence  systems  may  additionally  be  interested  in 
the  fourth  most  important  cost  driver  identified  in  the  Walston  and  Felix 
data,  that  of  secure,  or  classified,  operating  and/or  development 
environments. The potential for benefits resulting from implementing an
otherwise classified system in an unclassified environment through the
use of dummy test data and other techniques approaches 72%.

Such examples are provided to illustrate to system planners the
general method of determining guidelines through analysis of factors
pertinent to system costs. Consequently, they may utilize this data or
make use of new data as it becomes available to determine the benefits
and potentials existing in the scaled system methodology as it may apply
to any particular software development endeavor. The potential benefits
realized through experience and general system information are paramount
considerations in the application of the scaled systems approach.

3.3.4  Use  of  the  Individual  Walston  and  Felix  Group  Component  Rates 

It is reasonable that the individual Walston and Felix component
contribution rates can also be interpreted in a manner analogous to the
application of the generalized group productivity impacts. For example:

3.3.4.1 State-of-the-Art Hardware

In the event that a system requires state-of-the-art hardware not
yet available, the Walston and Felix data predict that the scaled system
implementation effort can benefit by a 12% increase in productivity
through the use of currently-available hardware. This could include
special, reusable hardware specifically programmed and configured for
scaled system implementations. In contrast, the Doty factors quantify
the savings realized through the elimination of the environmental factor
of concurrent hardware development with the software effort to be on the
order of 122%. Such hardware substitution is greatly facilitated by the
increasing use of higher-order programming languages (HOL) and, with the
approaching standardization on Ada within the DoD environment, scaled
systems will be more readily transportable for later enhancement
regardless of what hardware is used to implement them.


A major point made in "Scaled Systems Cost Effectiveness", a
technical memorandum submitted earlier under this research effort
(reprinted in Appendix A), was the application of cost/benefit, or
"break-even", analytical techniques to assessing the cost trade-offs of
using scaled systems. Such techniques are very useful to practitioners
of the scaled systems methodology in determining the cost/effort/schedule
feasibility of applying scaled systems techniques. The primary objective
of break-even analysis is to determine an overall system scale factor, or
ratio, such that the total estimated development cost of both the scaled
system and its full-scale counterpart "breaks even" with the estimated
cost of a full-scale implementation effort without the benefit of a
scaled system. This break-even point is important because it reveals
that a scaled system built on a smaller scale than the break-even
scale factor should result in overall project savings; conversely, if the
break-even scale factor cannot be achieved, then total project costs may
well be expected to exceed the cost of a traditional implementation
approach. Of course, this additional estimated cost is subject to
justification based upon such factors as a reduction in total project
risk, the achievement of design or performance goals, the development of
a user-friendly interface, or some other applicable success-oriented
criteria.

In "Scaled Systems Cost Effectiveness", break-even points were
determined through parametric analysis using an interactive software cost
estimation model. In that study, the break-even points were found to be
cost model dependent and displayed a static relationship with system size
across a wide range of system sizes. This static relationship is
attributed to the cost estimation relationships (CERs) internal to the
model and may or may not hold in actual practice.

Through subsequent research, it was determined that break-even
points could be directly calculated from productivity coefficients like
the Doty and the Walston and Felix environmental factors. This
calculation consists of simply subtracting the inverse of the factor,
expressed as a multiplier, from the value of one. For example, the
Platform group impact of 1.92 gives 1 - 1/1.92, or approximately 48%.
The resulting value quantifies two things. First, it maintains the
relative measure of importance the environmental factor originally
quantified as to its potential impact on a software development effort.
Secondly, and most important to scaled system cost/benefit analysis, the
resulting factor quantifies a break-even value for determining potential
scaled system cost effectiveness. This break-even value represents the
break-even scale factor for an implementation effort characterized by a
scaled system bearing the burden of the negative impact of the
environmental factor while the full-scale counterpart realizes the
benefits resulting from removal of the negative burden. Because these
factors rank environmental factors by overall system break-even scale
factors, they inform the scaled system practitioner of the freedom from
constraints he has in the construction of the scaled system, provided
the benefits of the scaled system include removal of the negative
impacts arising through the particular environmental factor. This
freedom factor reflects the constraints placed upon the scaled system in
terms of size, effort, and cost. Subsequently, these freedom factors
have been termed "Degrees of Scaling Freedom". The Walston and
Felix-derived factors are listed in Figure 3-31 while the corresponding
Doty-derived factors appear in Figure 3-32.
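
The break-even calculation described above can be sketched in Python as follows. The sketch rests on a simplifying assumption introduced here for illustration: cost is proportional to system size divided by productivity. Under that assumption a scaled system of relative size s, built under the burden of the environmental factor, costs s times the unaided full-scale cost C, and the subsequent full-scale effort costs C / f, where f is the factor expressed as a multiplier. Setting s*C + C/f = C gives s = 1 - 1/f. The function names are hypothetical.

```python
def break_even_scale_factor(multiplier):
    """Break-even scale factor for a productivity multiplier,
    e.g. 1.92 for a +92% impact."""
    if multiplier <= 1.0:
        # A factor that does not increase effort has no break-even point.
        raise ValueError("multiplier must exceed 1.0")
    return 1.0 - 1.0 / multiplier


def total_cost_with_scaled_system(baseline_cost, scale, multiplier):
    """Scaled-system cost plus the cost of the upscaled full system,
    under the linear cost assumption stated in the lead-in."""
    return scale * baseline_cost + baseline_cost / multiplier


# Two Walston and Felix group impacts and one Doty factor (132% -> 2.32).
for name, multiplier in [("Resources", 2.73),
                         ("Platform", 1.92),
                         ("CPU Time Constraints", 2.32)]:
    print(f"{name}: {break_even_scale_factor(multiplier):.0%}")
```

At exactly the break-even scale, `total_cost_with_scaled_system` returns the baseline cost; any smaller scale returns less, which is the sense in which a scaled system built below the break-even factor yields project savings.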
                                                     Break-Even Scale Factor

Resources                                                                63%
    Ave. Personnel Experience and Qualifications                         18%
    % Dev'mt. Programmers who Participated in Func. Design Spec.         15%
    % Utilization of Currently-available Hardware                        11%
    Degree of Previous Experience with Oper. Computer                    13%
    Degree of Previous Experience with Programming Languages             19%
    Degree of Previous Experience with Appl. Size and Complexity         17%

Platform                                                                 48%
    Customer Interface Complexity                                        25%
    Degree of User Participation in Req'mts. Def.                        11%
    Degree of Customer-Originated Design Changes                         12%
    Degree of Customer Experience in Application Area                    12%

Utilization                                                              46%
    Overall Program Design Constraints                                   17%
    Core Memory Design Constraints                                       21%
    Execution Time Design Constraints                                    17%

Security                                                                 42%
    Spcl. Req. for Access to Dev'mt. CPU                                 15%
    Amount of Open Access Time to Dev'mt. CPU                            17%
    Classified Security Environment for CPU and 25% of Programs & Data   17%

Structured Techniques                                                    41%
    Structured Programming                                               14%
    Design and Code Inspections                                          10%
    Top-down Development                                                 12%
    Chief Programmer Teams                                               15%

Complexity                                                               41%
    Overall Code Complexity                                              16%
    Complexity of Application                                            22%
    Complexity of Program Control Flow                                    9%

Code Mix                                                                 32%
    Proportion of Code classed as Non-Mathematical and I/O Formatting    11%
    Proportion R/T, Interactive, or Time-Critical Code                    9%
    Proportion of Code Intended for Delivery                             16%

Misc. Items                                                * Not Calculated *
    Proportion of Data Base Class-Items to 1,000 LOC
    Proportion of Doc. Pages to 1,000 LOC
    Ratio of Staff Size to Project Duration (People/Month)

Figure 3-31. Degrees of Scaling Freedom as Derived from the IBM/Walston
and Felix Data


Factor                                                 Break-Even Scale
                                                            Factor

CPU Time Constraints                                         57%
Concurrent Software and Hardware Development                 55%
Development CPU Different from Target CPU                    55%
Detailed Definition of Operational Requirements              50%
First Software Developed on CPU                              48%
Development at More than One Site                            43%
Real-Time Operation                                          40%
Limited Programmer Access to Computer                        33%
Developer Using Computer at Another Facility                 30%
CPU Memory Constraints                                       30%
Special Display                                              30%
Development at Operational Site                              28%
Changing Operational Requirements                             5%
Time-Share, Interactive Development [decreases effort]       N/A


Note: Break-even scale factors came from maximum factors for the
particular environmental attributes listed in Figure 3-23.

Figure 3-32. Degrees of Scaling Freedom as Derived from the Doty and
Associates Data
A scenario for general application of these degrees of scaling
freedom follows:

The system planners scan the lists of environmental factors and
determine which ones characterize the particular development environment
under scrutiny. These factors normally cause adverse impacts upon a
development effort and, hence, will have much the same impact on the
scaled effort. However, since the scaled effort is not as great as the
full-scale attempt, the absolute value of the penalties imposed by the
adverse environmental factors is much less for the scaled effort.
Ostensibly, these negative impacts will be removed and thus not affect
the upscaling effort, due to lessons learned through the scaled effort.
The overall savings resulting from this process contribute toward the
cost of the scaled effort and, possibly, to total project savings. The
question arises: how small must the scaled system be in order to achieve
cost savings? Obviously, it is hoped that the size of the scaled system
will not be so constrained that its operational value and predictive
ability are minimized. The answer to the question lies in the degree of
scaling freedom values like those listed in Figures 3-31 and 3-32. As a
gross surrogate, the system planner may initially simply use the maximum
of the applicable values as the break-even system scale factor, relying
upon the others to "back up" this estimate and to add to the measure of
confidence in its use. Adding the break-even scale factors together is
not recommended, as such a technique would most probably tend to produce
an overly optimistic break-even scale factor estimate. Root mean square
analysis may be applicable to the situation. To determine the root mean
square, the analyst calculates the square root of the sum of the squares
of all the degrees of freedom values which correspond to the
environmental attributes existing in the proposed effort.
Mathematically, the root mean square calculation looks like this:

\[
\sqrt{\sum_{i=1}^{n} \left(\text{Environmental factor}_i\right)^{2}},
\qquad n = \text{number of applicable environmental factors}
\]
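
The two ways of combining degrees of scaling freedom described above, taking the maximum and taking the square root of the sum of the squares, can be sketched in Python as follows. The sketch follows the wording of the text, which does not divide by the number of factors, so strictly it computes a root sum of squares. The function names and the choice of three applicable factors are assumptions made for illustration.

```python
import math


def conservative_break_even(degrees_of_freedom):
    """Gross surrogate from the text: the largest applicable value."""
    return max(degrees_of_freedom)


def root_sum_square(degrees_of_freedom):
    """Square root of the sum of the squares of the applicable values,
    as the text describes the calculation (no division by the count)."""
    return math.sqrt(sum(d * d for d in degrees_of_freedom))


# Hypothetical project characterized by three Doty-derived values from
# Figure 3-32, expressed as fractions: CPU time constraints, real-time
# operation, and limited programmer access to the computer.
applicable = [0.57, 0.40, 0.33]
print(f"maximum:         {conservative_break_even(applicable):.2f}")
print(f"root sum square: {root_sum_square(applicable):.2f}")
```

Because nothing is divided by the number of factors, the combined value is never smaller than the maximum and grows as more factors are included, which is one reason the maximum is the more conservative starting point.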

The validation of such an analytical technique is beyond the scope
of this research and must be left to future research on the application
of the scaled system methodology, as presented here, to actual systems.
At least these degree of scaling freedom factors are based upon actual,
credible empirical data. Again, a very conservative break-even scale
factor for a potential scaled system development effort can be obtained
from the maximum environmental factor degree of freedom value listed in
the tables.

The Scaled System Cost Effectiveness study determined the range of
the degree of scaling freedom values to vary from 10 to 50%. This means
that, based upon that study, scaled systems of 10-50% overall scale can
be expected to achieve cost savings. The variation can be attributed to
the application type and environmental factors present. These results
are not altogether inconsistent with the IBM/Walston and Felix-derived
degrees of freedom, which range from 9-63%, nor the Doty-derived degrees
of freedom, which range from 5-57%.

Based upon this study of empirical data, a general rule of thumb can
be derived: in order to achieve cost effectiveness, a scaled system
would most probably have to be scaled down to no more than 50% of the
full-scale system; however, in order for the scaled system to retain any
predictive value, it should not be scaled to less than 10%. Obviously,
the verification of this general rule can be achieved only through
actual practice of the scaled system methodology in a very controlled
and carefully documented manner. The resulting analysis of the data thus
provided will certainly contribute toward bettering the methodology and
enhancing these predictive measures. It is encouraging that, at this
point, the evidence suggests scaled systems built to as large as 50% of
the actual system can indeed provide cost savings as well as ensuring
the production of quality software systems.

3.4  Cost  Benefits 

The  estimation  of  implementation-dependent  cost  benefits  resulting 
from  the  use  of  scaled  systems  techniques  relies  heavily  upon  the 
capabilities  of  current  cost  estimation  models  and  methodologies  as  well 
as  the  subjective  analysis  performed  by  the  experienced  system  planner. 
A  general  familiarity  with  the  operation  and  capabilities  of  currently 
available  software  cost  estimation  models  is  therefore  required  of  the 
potential  scaled  system  practitioner.  Accordingly,  a  general  discussion 
of  current  cost  estimation  model  state-of-the-art  is  provided  in  this 
section  to  familiarize  the  reader  with  these  models  and  their  use.  As  a 
conclusion  of  this  section  and  report,  a  case  study  of  an  actual 
intelligence  system  is  presented  to  serve  as  a  model  for  subsequent 
scaled  system  feasibility/cost  benefit  analysis. 

3.4.1  Life-Cycle  Cost  Estimation  Models 

Estimation of the costs and schedule for software development is
crucial to accomplishing effective planning, budgeting, and evaluation
activities within an organization. These are the activities that in the
world of software project management have long been documented as
historical problem areas. Optimistic cost projections along with gross
errors have contributed to severe budget and schedule overruns.

The importance of software project life-cycle cost models has long
been recognized in the process of making viable software cost estimates
and the efficient allocation of resources. This does not mean that total
reliance should be placed upon the cost model to accomplish cost
estimating single-handedly. It must be accepted in the context of a
comprehensive cost estimation strategy where the cost model is viewed as
a tool for the competent cost analyst. Its value is derived from the
imposition of a disciplined and structured framework that compels the
analyst to consider and take into account all significant factors
influencing software development costs. The software life-cycle cost
model is a valuable tool that can account for complex nonlinear
relationships between seemingly random data through use of automated
mathematical and statistical analytical techniques.

3.4.1.1 Cost Estimation Model Methodology

In  general,  cost  model  operation  involves  calibration  to  historical 
experience,  input  parameter  determination,  model  operation  and  cost 
analysis/presentation,  and  risk  assessment. 

(1)  Calibration 

Representation of the developmental environment is the single
most important factor determining the model's applicability to projecting
cost behavior. The cost model must be either carefully designed to model
cost behavior within that environment or have the capability to adapt
itself to reflect the cost characteristics of any specified environment.
Consequently, the commercially available cost models are supplied with
the facility to be calibrated to a specific environment. This
calibration process is crucial to the ability of the model to project
cost behavior within a particular environment.

The objective of the calibration process is to tailor the model
through the use of an organization's historical cost data so that the
model's predictive ability, within the organization, is enhanced. Most
models have special functions to determine the variables for this purpose
and make them available to the user, in the form of an input parameter,
for subsequent use by the model. This input parameter is a global
descriptor, reflecting the professional quality and problem-solving
capabilities of the organization's technical and administrative staffs.

Two commercially available cost estimation models, PRICE S and
SLIM,  each  have  these  inputs.  One  reason  such  variables  are  made 
available  to  the  user  is  that  past  development  conditions  may  no  longer 
hold.  The  user  may  subsequently  find  it  necessary  to  adjust  these 
variables  in  order  to  achieve  more  realistic  cost  estimates.  Examples  of 
such  conditions  include  the  adoption  of  newer,  more  up-to-date  structured 
development  practices,  the  acquisition  of  more  powerful  development  tools 
and  facilities,  or  an  increase  in  the  skill  level  of  the  organization's 
personnel  resulting  from  prior  experience  (the  converse  of  this 
condition,  the  decrease  in  skill  level  due  to  personnel  turnover  and  new 
hires,  is  also  possible). 
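
The general idea of calibrating a cost model to an organization's historical data can be illustrated with a small Python sketch. This is not the calibration procedure of PRICE S or SLIM, whose internals are not described here. It assumes, purely for illustration, a power-law cost relationship of the form effort = a * size^b and fits a and b by least squares on log-transformed data. The function name and the historical project data are hypothetical.

```python
import math


def calibrate_power_law(sizes, efforts):
    """Fit effort = a * size**b by ordinary least squares on the logs.

    sizes and efforts are equal-length sequences of positive numbers
    taken from completed projects. Returns the pair (a, b).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic "historical" projects generated from a known relationship,
# so the fit can be checked: sizes in thousands of lines, efforts in
# person-months.
history_sizes = [10.0, 20.0, 40.0, 80.0]
history_efforts = [5.0 * s ** 0.9 for s in history_sizes]

a, b = calibrate_power_law(history_sizes, history_efforts)
print(f"calibrated: effort = {a:.2f} * size ** {b:.2f}")
```

If development conditions change, as described above, the analyst would either refit on more recent projects or adjust the fitted coefficients by judgment, which is the role the adjustable calibration inputs play in the commercial models.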

(2) Input Parameter Determination


There are so many factors that can potentially affect program
development costs that it is virtually impossible (and impractical) for a
cost model to attempt to deal with them all on an individual basis. This
problem has been tackled by grouping related factors and representing
each category by a generalized model input parameter. This is where
structured systems thinking is required, resulting in the disaggregation
of a large, ambiguous task into a structured decomposition of several
smaller, more manageable tasks for cost estimation. It is therefore
necessary for the cost analysis team to aggregate the effects of all
related factors so that their combined effects may be synthesized into
the appropriate mix of model input parameters.

The  schema  for  input  parameter  estimation  includes  such 
all-encompassing  project  considerations  as  staff  capabilities,  product 
attributes,  application  requirements,  environmental  factors,  development 
practices,  and  management  policies.  Intelligence  systems  built  to 
military  specifications  would  include,  as  an  example,  general  provisions 
for product attributes such as real-time operation, modularity, and
strict documentation standards for usability and maintainability;
application requirements for testability, quality assurance, and
reliability; environmental factors such as secure developmental and
operating facilities; development practices such as structured design
walk-throughs and close progress tracking; and management policies with
regard to staffing and the scheduling of project milestones.

(3)  Model Operation - Output Analysis, Presentation

Once the cost analysis team has collected, analyzed, and
synthesized all of the pertinent cost data relevant to the proposed
development effort into the appropriate model input parameters (and
calibrated the model, if sufficient historical data is available), they
are ready to use that data to exercise the model. The model's output
consists of milestone schedules, staff-loading profile charts, and some
measure of cost expressed either in terms of personnel effort or dollars.
In addition, the model may break down the expenditure estimate by labor
category such as technical, managerial, coding, documentation, etc.
and/or functional category such as design, code, test and integration,
maintenance, etc. In the event that the model encounters a set of input
data which is inconsistent with the formulated guidelines, it will also
produce the appropriate warning or error messages.

(4)  Estimate  Risk  Assessment 

One aspect of the cost estimate which the particular model may
address is the measure of uncertainty, or risk, associated with the cost
estimate. This is usually either a standard statistically-derived
measure upon the estimate, such as a root mean square (standard
deviation) value, or the preparation of a sensitivity profile obtained
through parametric analysis of the input parameters. In order to realize
the full meaning and value of such risk measures, the assumptions and
method by which they are derived must be understood by the cost analyst.
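
A sensitivity profile of the kind mentioned above can be sketched as a one-at-a-time parametric sweep. The model form, parameter names, and baseline values below are assumptions made for illustration; they are not taken from PRICE S, SLIM, or the Doty model.

```python
def effort(params):
    """Hypothetical parametric model: a * size**b * product of f_* multipliers."""
    result = params["a"] * params["size_ksloc"] ** params["b"]
    for name, value in params.items():
        if name.startswith("f_"):
            result *= value
    return result


def sensitivity_profile(model, baseline, step=0.10):
    """Vary each input by -step and +step while holding the others fixed.

    Returns {parameter: (relative change at -step, relative change at +step)}.
    """
    base = model(baseline)
    profile = {}
    for name, value in baseline.items():
        low = model({**baseline, name: value * (1 - step)})
        high = model({**baseline, name: value * (1 + step)})
        profile[name] = ((low - base) / base, (high - base) / base)
    return profile


# Hypothetical baseline inputs.
baseline = {"a": 3.0, "b": 1.2, "size_ksloc": 15.0,
            "f_realtime": 1.3, "f_staff": 0.9}

for name, (down, up) in sensitivity_profile(effort, baseline).items():
    print(f"{name:12s} {down:+.1%} {up:+.1%}")
```

Under these assumed values the exponent dominates the profile: a 10% change in b moves the estimate far more than a 10% change in any multiplier, which is the sort of finding an analyst needs to understand before quoting a risk figure.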

3.4.1.2  Cost Estimating Considerations

An  important  consideration  often  overlooked  or  misunderstood  is  that 
cost  estimation  models,  in  and  of  themselves,  are  not  a  panacea  to  the 
general  problems  inherent  in  software  cost  estimating.  This  has  been 
documented  by  user  groups  that  have  actual  experience  with  cost 
estimation and evaluation methodologies. These groups stress the value
and importance of juxtaposing the results obtained from the automated
cost  estimation  models  with  the  sound  judgement  and  professional 
experience  of  the  available  technical  staff  members  to  produce  credible 
and  realistic  cost  estimates.  This  is  necessary  because  the  cost  models 
quantify  past  experience.  Cost  projections  based  upon  such  historical 
data  include  a  certain  degree  of  risk  because  of  the  advances  which  are 
occurring  in  the  software  industry. 

Another  source  of  error  in  the  models  is  their  extreme  sensitivity 
to  relatively  small  adjustments  in  their  input  parameter  mix.  This  is 
because  the  underlying  software  cost  relationships  are  characterized  by 
complex  exponential  functions  determined  by  the  intricate 
interrelationships  of  the  various  input  parameters  themselves.  This 
problem  is  particularly  acute  when  using  input  parameters  that  fall 
outside  the  ranges  for  which  the  model  was  calibrated. 
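
To see why small adjustments compound, consider a generic multiplicative form, assumed here purely for illustration (it is not the published PRICE S or SLIM formulation), in which effort E depends on size S through a fixed coefficient a, an exponent b, and n input multipliers f_i. Taking logarithms and differentiating gives the first-order relative error:

```latex
E = a\,S^{b}\prod_{i=1}^{n} f_i
\quad\Longrightarrow\quad
\frac{\Delta E}{E} \;\approx\; b\,\frac{\Delta S}{S}
  \;+\; (\ln S)\,\Delta b
  \;+\; \sum_{i=1}^{n}\frac{\Delta f_i}{f_i}
```

Under this assumed form the relative errors of the individual inputs add, and an error in the exponent is weighted by ln S, so modest misjudgments in several parameters can together produce a large swing in the estimate.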

Solutions  to  these  problems  include  fine-tuning  of  the  input 
parameter  mix  to  match  preconceived  cost  targets  as  well  as  assessing  the 
already  mentioned  existence  of  wide  variations  (risk)  in  the  estimates. 
These  variations  raise  an  important  philosophical  point:  that  the 
dynamics  of  software  cost  estimating  are  such  that  obtaining  high 
accuracy  in  the  point  estimates  is  neither  possible  nor  desirable  due  to 
the  calculational  variation  which  is  present. 

The  utility  of  the  costing  model  lies  in  the  structure  and 
discipline  imposed  upon  the  costing  process.  The  reliability  of  the 
resulting  estimate  is  dependent  on  the  assumptions  which  produced  it. 
All estimates must be scrutinized by experienced cost analysts to ensure
that the results, along with the underlying assumptions, are reasonable.

3.4.1.3  Assessment of Cost Model State-of-the-Art


Growing  acceptance  of  life-cycle  models  arises  not  only  because  of 
their potential to serve management in the planning, programming,
monitoring, and evaluation of software production efforts (through the
provision  of  schedules,  manpower  allocation  profiles,  and  cost 
estimates),  but  also  because  of  their  merits  in  creating  a  structured  and 
disciplined  approach  to  the  estimation  and  evaluation  of  cost  estimates 
to  serve  decision-makers'  needs.  Nevertheless,  attention  to  the 
capabilities,  limitations,  characteristics,  and  purpose  of  software 
life-cycle  cost  models  must  be  fixed  in  the  minds  of  those  who  use,  as 
well  as  those  who  develop  them. 

Note  the  current  state  of  development  of  the  estimating  technology. 
Software  life-cycle  cost  models  are  in  an  infant  stage  of  their  product 
life-cycle.  They  are  still  growing  in  acceptance  and  popularity. 
Advances  are  being  made  in  their  underlying  theoretical  formulation  as 
well  as  in  refinement  of  their  operational  and  functional 
characteristics.

The skepticism of those who consider these models to be expensive
frills is not altogether unfounded. The cost of the commercially
available models is artificially inflated due to the profits required by
the vendors to recoup their large investment in research and development.
However, as with all newly marketed technologies, it is not unrealistic to
expect that, as economies of scale and competitive market forces come
into play, the prices should decline.

As for the life-cycle models themselves, we can, for convenience,
classify them into two general groups: commercial general purpose and
academic special purpose.

(1)  Commercial General Purpose Cost Models

Under the category of commercial general purpose models, we
find two popular models. One is RCA's PRICE S, which is actually a
member of a family of three related cost estimation models which also
includes PRICE, a hardware manufacturing cost estimation model, and PRICE
SL, a software life-cycle cost estimation model. The other is SLIM, a
product available for lease from Quantitative Software Management, Inc.,
headed by Lawrence Putnam. Mr. Putnam is credited with making, and
publicizing, significant advances in the area of the theory of software
costing and estimation.

Although not commercially available, a new integrated approach
to cost estimating and evaluation occurs in the form of a system which
utilizes both the PRICE S and a modified version of the Putnam model to
evaluate cost proposals at the Space and Missile Systems Organization's
(SAMSO) Software Programs Office (SPO) at Los Angeles, California. The
system is implemented on a Hewlett-Packard Series 3000 and supports
generalized pre-processing interpretation of standardized input formats
for subsequent input to both models as well as graphics-oriented
post-processing facilities for both models' outputs.
Additionally, the system has an integral data base management system and
a financially-oriented report generator.

(2)  Academic/Special  Purpose  Models 


Widely discussed in the research literature and at various
conference and workshop proceedings is a collection of software
estimation models which result principally from special purpose and
academic pursuits. These models have a limited range of applicability
since they reflect specific environments, a limited scope of
applications, and/or products of similar sizes and attributes. Thus,
they are not considered general purpose cost estimation models.

Perhaps the most frequently referenced article on the subject
of quantifying software development productivity rates and estimation
ratios is that of IBM's Walston and Felix, which first appeared in an IBM
technical journal in 1977 [ref. 24]. In the literature we additionally
find many papers describing and commenting on the theories of cost
estimation set forth by Mr. Putnam, which are principally an extension of
the work performed by still another IBM researcher, Peter Norden [ref.
15]. From efforts expended by and under the direction of Victor Basili
[refs. 3, 4, 5, & 6] of the Computer Sciences Department at the
University of Maryland comes a rich proliferation of articles,
dissertations, and research findings encompassing a vast array of
software engineering topics, including cost and cost-related modeling.

The government itself is active in the areas of software
engineering and development of cost estimating techniques with research
grant activity through organizations such as the NASA/Goddard Space
Flight Center [refs. 2, 6. 4], the Rome Air Development Center (RADC)
[ref. 7], and various educational institutions, including the University
of Maryland. This activity is highlighted by the published findings of
Putnam [ref. 17-21] and Doty Associates, Guidelines for Improved Software
Cost Estimating [ref. 7], as well as through the personal and
professional-level contributions made by cost analysts such as William
Lasher [ref. 9] of the already mentioned SAMSO SPO.

It is upon such research efforts and activity that INCO has
based the development of its own version of a software life-cycle cost
estimation model. This software life-cycle cost estimation model was
targeted for in-house operation on low-cost microprocessor hardware.

3.4.2  Estimation  of  Scaling  Benefits 

In  the  context  of  scaled  system  development,  cost  estimation  must 
take  into  account  the  differences  that  exist  between  the  scaling  and 
up-scaling  environments.  Certainly,  there  is  a  host  of  factors  to  take 
into  consideration.  Generally,  the  scaling  environment  will  resemble 
that  of  most  other  developments  not  using  the  scaled  systems  approach. 
Potential  differences  in  the  scaling  environment  might  include  the 
benefits  of  such  scaled  system  facilities  as  special  low-cost, 
general-purpose  hardware  and  software  development  tools  specifically 
tailored  to  scaled  systems  development  and  a  greater  degree  of  technical 
user  orientation.  Aside  from  such  advantageous  environmental  niceties, 
however,  the  scaled  system  development  environment  will  most  generally  be 
subject  to  the  same  negative  environmental  impacts  as  most 
start-from-scratch development efforts.

The positive impacts will be realized in the up-scaling environment.
The up-scaling environment will be much more conducive to system
development due to the lessons learned through the scaling effort,
regardless of the degree of success obtained during the scaled effort.
Characteristics of up-scaling environments include firm specifications of
user-oriented functional, operational, and design requirements. In
addition, the up-scaling effort will probably benefit from any inventory
of design documentation and source code accumulated through the scaling
effort. Also, there should be an optimal hardware configuration chosen
to precisely match the needs of the particular application based upon the
experience of the scaled system development. By the time up-scaling
takes place, the task breakdowns and schedules will have been well
defined and laid out, and all the parties involved will have a strong,
unified concept of the end result and will be in agreement as to what the
common goals of the system are. In short, an optimal environment will
have been developed, complete with fully-detailed descriptions of the
tasks at hand and the product to develop - all unencumbered by the
greatest contributors to development project risk and schedule slippage:
functional and technical uncertainty.

To quantify the impacts, benefits, and trade-offs inherent in these
environments, factors such as those supplied by the Doty and Walston and
Felix research efforts, as well as this effort, are available. Some cost
estimation models can specifically account for these factors; others will
have to be adjusted or modified to do so. At the highest level, these
factors can augment experienced technical managers' subjective
assessments and their resulting estimations of a project's cost,
schedule, and risk.

3.4.3  A  Case  Study  for  Scaled  Systems 

In terms of examining actual intelligence systems which could
benefit from application of the scaled systems development approach, this
research had the opportunity to explore the possibilities of one such
system. Fortunately, this system is unusual in that a
corresponding scaled version exists and is operational. Although this
scaled version was developed after the full-scale implementation, its
existence provides a "hands-on" feel for examining the application of
analytical techniques suitable to the scaled systems methodology. This
seemingly "reversed-scaling" approach to system development resulted from
different motivations for this particular scaled system, and this must be
kept in mind so as not to introduce any bias in our case study. As
stated, this scaled system realized benefits from the full-scale
implementation, which was operational, or at least semi-operational, at
the time the scaled implementation took place. This scaled system, the
Indications and Warning Training System (IWTS), was motivated by the need
for a simulator to train intelligence analysts on how to operate an
interactive terminal-oriented communications, command, and control
intelligence system - the NMIC.

3.4.3.1  Background

After  the  NMIC  achieved  an  initial  operating  capability  in  1978, 
INCO,  INC.  responded  to  the  need  for  user  training  with  a  low-cost, 
stand-alone  analyst  training  system  complete  with  its  own  special-purpose 
hardware.  As  stated,  this  system  is  referred  to  as  the  IWTS  and  it  was 
delivered  in  early  1979.  Because  this  training  system  appears  to  the 
user  as  the  "real"  NMIC  system  and  it  models  100%  of  the  full-scale 
system's functional and operational characteristics, we take the view
that it is "the scaled system that could have been".

3.4.3.2  System Development Data Collection

As the implementor of the IWTS, INCO had sufficient data readily
accessible concerning its development environment, operational
characteristics, and required development effort and costs.
Unfortunately, the same was not entirely true of the actual NMIC system;
however, due to INCO's past and present involvement with the design,
development, maintenance, and enhancement of the NMIC system as well as the
completion of the NMIC Functional Analysis/Enhancement Study [ref. 14],
adequate data was available to prepare this case study.

3.4.3.3  The NMIC System

Upon  initial  familiarization  with  the  NMIC  system,  it  appeared  to  be 
a  nightmare  for  system  implementors,  having  nearly  all  of  the  adverse 
characteristics  possible  of  a  state-of-the-art  intelligence  system.  Of 
course,  many  of  these  characteristics  incrementally  complicate  the 
development  of  such  a  system  and  drive  development  cost,  schedule, 
effort,  and  risk  accordingly  higher  -  a  perfect  candidate  for  application 
of  the  scaled  system  development  approach. 

Through  this  research  effort,  many  of  these  characteristics  could  be 
identified  and  classified  as  either  new  hardware  design  and  development, 
new  software  system  design  concepts,  or  intelligence  system  dependent 
factors.

As  for  new  hardware,  the  NMIC  boasted  a  wide  assortment  of 
state-of-the-art  hardware  concepts.  It  was  envisioned  to  be  a  clustered 
network  of  multiple  minicomputers  interconnected  by  a  new  bus  technology 
and system architectural concept. It was to communicate with its users
through a totally new, concurrently developed and programmed, intelligent
dual-screen full function video/graphics terminal. There were to be
fifty such terminals dispersed geographically. Being a message and
communications system, it was to process a number of real-time inputs and
outputs.

As for new systems design concepts, the NMIC incorporated several
new ideas based upon motivations for system reliability, automatic system
error and failure recovery, user flexibility, and high-level access to a
number of other existing intelligence systems and networks. The basic
internal functions of the system were to be distributed throughout the
minicomputer network; hence, reliability and functionality were enhanced
by dedicating each minicomputer to its corresponding
function. If a minicomputer failed, only that function would be
incapacitated - but what about the recovery of that function? In
anticipation of such an event, the NMIC system originally incorporated a
system design concept of "fall-back and recovery", whereby another member
of the computer network would recover the function lost by the failed
computer. Since real-time processing was an integral part of the system,
provisions for the design and coding of much time-critical (assembler
language) code had to be made.

Additionally,  the  NMIC  system  was  confronted  with  a  variety  of 
factors  typical  of  an  intelligence  system.  First,  the  development  and 
operational  environments  were  characteristically  security  sensitive.  The 
members  of  the  developmental  and  operational  staffs  were  thus  required  to 
have  or  obtain  the  appropriate  security  clearances  in  order  to  work  on 
and  have  access  to  the  system.  Such  an  environment  increases  development 
costs  because  it  requires  additional  controls  to  be  imposed  upon  the 
development  facilities  and  personnel  by  way  of  locks,  guards,  logs,  etc. 
and because personnel may not be available, may have to be unproductively
employed while waiting for their clearances, or may have to be selected
on the basis of clearance rather than skills. While such an environment
is necessary in order for a system and its staff to handle classified
material, it creates the opportunity for scaling of developmental cost by
developing technical concepts in a lower-cost, non-secure environment.
For example, the "fall-back and recovery" concepts and algorithms for the
NMIC system have been developed outside the secured environment. Also,
due  to  its  potential  importance  to  the  national  security,  the  system 
development  required  the  most  stringent  of  management,  design, 
documentation,  and  configuration  control  practices  as  well  as  a  large 
degree  of  operational  functionality,  reliability,  and  robustness. 

The varying degrees of success the NMIC achieved in meeting all of
its functional and design goals are largely a matter of record and not of
great importance to this case study. What is of concern here is the
measurement of: (1) the negative productivity impacts the NMIC
development sustained in the face of its developmental, environmental,
and operational obstacles, and (2) the benefits a scaled prototype might
have contributed to the achievement of the NMIC's overall objectives in
terms of functionality, budget, and schedule.

3.4.3.4  The IWTS System

The  NMIC  did  not  have  the  luxury  of  a  cost-effective  scaled 
prototype  version  to  facilitate  its  specification,  design,  and 
development.  If  it  did,  however,  the  resulting  scaled  system  would  most 
probably  have  closely  resembled  the  INCO  IWTS  trainer.  A  requirement  of 
the IWTS trainer was to fully provide and support all of the NMIC
system's functional and operational features at the user terminal level.
The IWTS was developed on low-cost, stand-alone, microprocessor-based
hardware and, as such, provided very little in the way of the actual
full-scale system's capabilities to receive, route, and send "real"
messages. Such processes were emulated, however, through a pre-stored
set of messages reflecting general scenarios of communications that the
potential NMIC user would normally encounter. The end result was that
the user operating the IWTS terminal has virtually no idea that he is
actually using the trainer, but rather has the impression that he is
logged on to the full-scale NMIC system. A diagram of the IWTS design is
illustrated in Figure 3-33.

Of importance to this case study is the measurement of the degree of
scale the IWTS trainer achieved relative to the NMIC system, its relative
cost, and potential predictive ability. The amount of potential savings
the NMIC could have actually realized through the use of a system like
the IWTS to serve as a scaled prototype development testbed is, however,
basically a matter of conjecture, as such an estimate could only be based
upon hindsight.

Hardware  scale  factor  is  perhaps  least  difficult  to  compute.  Here, 
dollar  costs  are  used  as  the  metric  since  they  are  most  readily 
available, tangible, and understandable in nature as opposed to some
ambiguous measure such as hardware system capacity or power. Relying on
a figure obtained from the results of INCO's DIIS FA/ES final study
report, the estimated hardware cost of the current NMIC configuration
approximates $3.5 million. This figure does not, however, reflect final,
enhanced configuration hardware costs of roughly $9 million. These costs
reflect the use of current state-of-the-art minicomputers and their
associated high-speed I/O peripherals, as well as the U-1652 dual-screen
terminal. In contrast, the IWTS, while using the same user terminal,
makes use of relatively lower-cost microprocessor-based hardware. The
IWTS basic hardware configuration cost is in the neighborhood of $50K,
with most of the cost being attributed to the cost and availability of
the U-1652 dual-screen, "TEMPEST"-certified (electronic emanation)
terminal. Using the cost of the current NMIC configuration, the
resulting hardware cost scale factor is computed through:

    $50,000 / $3,500,000  ~  1 1/2 %  Scale Factor for Hardware Cost

Of  course,  a  scaled  development  approach  would  have  required  the 
purchase  of  both  hardware  configurations,  increasing  total  project  costs. 

A  consideration  would  be  the  security  classification  of  the  micro 
hardware in order to implement some of the classified features of the
final  system;  failure  of  the  hardware  to  obtain  the  prerequisite 
certifications  would  necessitate  the  use  of  software  emulation  techniques 
to  implement  similar,  non-classified  versions  of  the  functions. 

Software sizing proves to be a much more difficult task. For one
thing, the word sizes and instruction lengths of minicomputers differ
from those of microcomputers. This situation presents a number of problems.
First, assembler source language statement counts and required core
memory sizes are not directly comparable. Secondly, the statement count
for the micro is, in all probability, inflated as compared to the mini
since more instructions are required to perform similar tasks because the
relative computational and logical power of the micro is not as great as
that of the mini. The microcomputer programmer finds himself generating
several primitive instructions on the micro where a single instruction
might accomplish the same result on the more powerful mini.

The full-scale NMIC system's module sizes were used as the standard
basis on which to compare the different software sizes of the two
systems. In this way, it was hoped that the problems of the two systems'
differing language dialects could be resolved and some degree of
consistency maintained in the software sizing analysis. Subsequently, it
was determined that much of the IWTS trainer's processing was performed
in the NMIC's "USS", or User Support Subsystem. Accordingly, the
estimated size of the USS served as a surrogate value and a consistency
check on the trainer's estimated size. Estimated sizes are used in lieu
of the prohibitively long process of actually counting source statements
from listings and because of the problems inherent in adding sizes of
differing computer language dialects, such as assembler and HOL. In the
case of the NMIC, the assembly language used in the bulk of the system's
modules was PDP MACRO-11 and FORTRAN was used where an HOL was
applicable. Intel 8080 assembler was used for the IWTS with HOL
applications programmed in BASIC.

The sizing analysis resulted in the sizes of the
two systems being set to 15,000 source lines of code (SLOC) for the IWTS
and 80,000 for the NMIC. These sizes resulted primarily from assessments
made by INCO technical staff members who participated in the development
of the IWTS and NMIC systems as well as those who performed the DIIS
FA/ES study. The size for the IWTS was intentionally set pessimistically
high and the size for the NMIC optimistically low in order to avoid any
bias in the resulting analysis. The results of the computation for
software size scale factor follow:

    15,000 / 80,000  ~  20 %  Scale for Software Size
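
The scale-factor arithmetic above can be reproduced with a short sketch. The dollar and SLOC figures are those quoted in the text; the function name is hypothetical, and the ratio against the roughly $9 million enhanced configuration is an added illustration that the text itself does not compute.

```python
def scale_factor(scaled, full):
    """Ratio of the scaled system's measure to the full-scale system's."""
    if full <= 0:
        raise ValueError("full-scale measure must be positive")
    return scaled / full


# Hardware cost: IWTS (~$50K) against the current NMIC configuration ($3.5M).
hardware_current = scale_factor(50_000, 3_500_000)

# Illustration only: the same IWTS cost against the enhanced NMIC
# configuration (roughly $9M), which the text mentions but does not use.
hardware_enhanced = scale_factor(50_000, 9_000_000)

# Software size: 15,000 SLOC (IWTS) against 80,000 SLOC (NMIC).
software_size = scale_factor(15_000, 80_000)

print(f"hardware, current configuration:  {hardware_current:.2%}")
print(f"hardware, enhanced configuration: {hardware_enhanced:.2%}")
print(f"software size:                    {software_size:.2%}")
```

The exact software ratio is 18.75%, which the text rounds to about 20%; the hardware ratio of about 1.4% is likewise rounded to 1 1/2 %.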

The tracking of actual development effort frequently escapes the
ability of most organizations, as the means of data collection is usually
not present and the figures get absorbed in the aggregation of total
labor hours expended throughout the entity. For the analysis of
development effort scale, the Doty cost model was used to estimate the
amount of effort expended on the NMIC because the actual figure was
unobtainable. Data provided by managers of the IWTS project was used as
the effort measure for that system. To maintain consistency and
establish a common means of measuring effort, the Doty model, as
programmed into INCO's own cost model (described in section 2.3), was
used to cross-check the managers' measures. Hence, the model was used to
estimate effort measures for both the IWTS and the NMIC.
Surprisingly enough, the model's estimate for the IWTS was in very close
agreement with the actual figures. This result increased confidence in
the use and applicability of the Doty model to this particular analysis.

The Doty cost model estimated the IWTS development cost to be on
the order of $225,000 over ten months; in contrast, the IWTS management
supplied a figure of $200,000 over one year. The environmental factors
considered for this Doty run are listed in Figure 3-34. The NMIC
estimate came out to be roughly $10 million over 18 months and the
factors applicable to this estimate appear in Figure 3-35. In order to
estimate the cost benefits resulting from the use of the IWTS system as a
scaled prototype, the NMIC data was input to the cost model again, with

From the Doty & Associates (RADC) Studies:


Please Select an Application Category:

  1 - Utility (OS)
  2 - Command & Control (c2)
  3 - Scientific
  4 - Business
  5 - All (Others not listed above)
  6 - EXIT (Return to master CM menu)

Selection (1-5)?  2

Estimated Deliverable Source LOC (1,000's)?  15

Please input a yes/no (Y/N) response to each of these 14 questions:

  Special display?  Y
  Detailed definition of operational req'mts?  Y
  Changing operational req'mts?  N
  Real time operation?  N
  CPU memory constraint?  Y
  CPU time constraint?  N
  First S/W developed on CPU?  N
  Concurrent development of ADP H/W?  N
  Interactive development environment?  Y
  Off-site development computer facilities?  Y
  On-site development computer facilities?  N
  Development computer different than target computer?  Y
  Multi-site development computer facilities?  N
  Unlimited programmer access to computer facilities?  Y

56.0 Person Months req'd for analysis, design, code, debug, test and checkout.
( Standard error on this approximation = 41.1% )
Estimated schedule duration = 9.9 Months

Continue (Y or N)?


Figure 3-34.  IWTS Cost Estimate
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From  the  Doty  &  Associates  (RADC)  Studies: 


Please Select an Application Category:

1 - Utility (OS)
2 - Command & Control (C2)
3 - Scientific
4 - Business
5 - All (Others not listed above)
6 - EXIT (Return to master CM menu)

Selection (1-5)? 2

Estimated Deliverable Source LOC (1,000's)? 80
Please input a yes/no (Y/N) response to each of these 14 questions:

Special display? Y
Detailed definition of operational req'mts? N
Changing operational req'mts? Y
Real time operation? Y
CPU memory constraint? Y
CPU time constraint? Y
First S/W developed on CPU? Y
Concurrent development of ADP H/W? Y
Interactive development environment? Y
Off-site development computer facilities? N
On-site development computer facilities? Y
Development computer different than target computer? N
Multi-site development computer facilities? N
Unlimited programmer access to computer facilities? N

2115.1 Person Months req'd for analysis, design, code, debug, test and checkout.
( Standard error on this approximation = 41.1 % )

Estimated schedule duration = 18.1 Months

Continue (Y or N)?


Figure 3-35. NMIC Cost Estimate
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the  adjustment  of  three  of  the  environmental  factors  to  account  for  the 
positive impacts resulting from the use of the scaled system. The three
factors adjusted were: firmness of system operational specifications,
absence of changing operational requirements, and the absence of parallel
hardware development. The factors input to the Doty Model are
illustrated in Figure 3-36. The Doty cost model provided an estimate of
approximately $3.5 million for the NMIC development with the benefit of
a scaled system, an estimated savings of approximately $6.5 million.
With such estimated savings, the NMIC could have cost-effectively
afforded the equivalent of twenty-six IWTS development efforts. A
conservative figure of $250,000 was used as the development effort cost
for the IWTS and the resulting scale factor equation was:


    $250K / $10 Million  ~  2.5%  Scale Factor for Development Effort
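
The scale factor and savings figures quoted here follow from simple
arithmetic. A minimal sketch is given below; the function and variable
names are illustrative and do not come from the report, and the dollar
amounts are the rounded estimates quoted in the text.

```python
# Rounded dollar estimates quoted in the text.
IWTS_COST = 250_000            # conservative IWTS development cost
NMIC_COST = 10_000_000         # Doty estimate without a scaled system
NMIC_COST_SCALED = 3_500_000   # Doty estimate with three factors adjusted


def scale_factor(scaled_cost: float, full_cost: float) -> float:
    """Scaled-system cost expressed as a fraction of the full-scale cost."""
    return scaled_cost / full_cost


def estimated_savings(full_cost: float, cost_with_scaling: float) -> float:
    """Reduction in the full-scale estimate attributed to the scaled system."""
    return full_cost - cost_with_scaling


if __name__ == "__main__":
    savings = estimated_savings(NMIC_COST, NMIC_COST_SCALED)
    print(f"scale factor: {scale_factor(IWTS_COST, NMIC_COST):.1%}")
    print(f"estimated savings: ${savings:,.0f}")
    print(f"IWTS-sized efforts covered by the savings: {savings / IWTS_COST:.0f}")
```

The last line divides the estimated savings by the IWTS cost, which is
how the "twenty-six IWTS development efforts" comparison is obtained.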

The resulting scale factors for the IWTS versus the NMIC are
graphically illustrated in Figure 3-37.

From the Doty & Associates (RADC) Studies:

Please Select an Application Category:

1 - Utility (OS)
2 - Command & Control (C2)
3 - Scientific
4 - Business
5 - All (Others not listed above)
6 - EXIT (Return to master CM menu)

Selection (1-5)? 2

Estimated Deliverable Source LOC (1,000's)? 80
Please input a yes/no (Y/N) response to each of these 14 questions:

Special display? Y
Detailed definition of operational req'mts? Y
Changing operational req'mts? N
Real time operation? Y
CPU memory constraint? Y
CPU time constraint? Y
First S/W developed on CPU? Y
Concurrent development of ADP H/W? N
Interactive development environment? Y
Off-site development computer facilities? N
On-site development computer facilities? Y
Development computer different than target computer? N
Multi-site development computer facilities? N
Unlimited programmer access to computer facilities? N

733.2 Person Months req'd for analysis, design, code, debug, test and checkout.
( Standard error on this approximation = 41.1 % )

Estimated schedule duration = 18.1 Months

Continue (Y or N)?


Figure 3-36. NMIC Cost Estimate with Benefit of a Scaled System
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Figure  3-37.  Scale  Factor  Determination 
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SECTION  4.  REMAINING  RESEARCH 

In this research effort, parameters of software systems that are
suitable for scaling have been identified and metrics have been defined
for them. These scale factors have then been related to the parameters
of the operating system performance simulator. It would be worthwhile
to further develop the simulator to make its parameters more sensitive
to the particular requirements of IDHS, i.e., refine its design to be
less general purpose and more IDHS-specific. These refinements would
aid significantly in analyzing the particular scaled system needs of
IDHS.

In addition, a complete set of equations relating system parameters
to scale factors would be of great value. The philosophy of this
approach and an initial delineation of a subset of the simulator
variable-scale factor equations are described in "Interrelationships
Among Scaling Factors" (Appendix D) and "Simulator Variable-Scale Factor
Equations" (Appendix E). Expressing all the input parameters as fairly
simple analytic functions of the scaling factors would permit more
extensive and definitive analysis of scale factor interrelationships,
giving the system designer a better tool for evaluating full-system
expectations based on those of the scaled system. It would also enable
further investigation of how scale factors behave in different operating
regions, in a manner which would help eliminate the uncertainties of the
interplay of more than one factor, i.e., the effects of combinations of
parameters.

As illustrated in Figure 4-01, the enumeration of guidelines for
scaling factors, together with the derived performance relationships and
cost modeling results, would be used to produce a "scaling handbook",
which would be of great value in the design of IDHS. Further work should
be done to codify the results of this research to produce the handbook,
as well as to verify the efficacy of the scaled systems methodology by
experimental means.

Figure 4-01. The Scaling Handbook

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Full-Scale  Effort  -  See  "Scaled  Effort" . 


Scaled Effort - the actual or projected manpower-related effort required to
construct a system to a certain scale. If the scale is 100% (or 1:1),
then the corresponding effort required to construct the system is referred
to as the "full-scale effort"; conversely, if the scale is less than 100%
(or 1:n, where n is greater than one), then the effort is "scaled" in the
sense that it is, in some measure, less than the corresponding full-scale
effort.

Scaled System - an operational system that differs from an ultimate
full-scale system in magnitude or degree of functional or operational
sophistication and that can be quantitatively related to that system by a
scale percentage rate, fraction, or ratio (e.g., "50% scaled", "1/2 scale",
or "scaled 1:2").

Scaled  System  Development  Effort  -  the  process  that  results  in  an  operational 
system  built  to  a  relative  scale  with  respect  to  a  target  system.  The 
purpose  of  a  scaled  system  is  to  serve  as  a  testbed  for  detecting  design 
deficiencies  in  the  target  system  so  the  necessary  changes  can  be  made  in 
the  front-end  of  the  development  cycle,  where  changes  are  less  costly  to 
effect  than  in  the  tail-end  of  the  development  cycle. 

Scaled System Development Methodology - the formalization of those processes
that comprise a scaled system development effort and provide the
theoretical foundation for such an effort. The methodology has two
principal phases: one to construct and evaluate a scaled system, and
another to construct the desired target system.

Un-Scale  -  See  "Up-Scale". 

Up-Scale  -  the  process  that  incorporates  evaluative  and  design  factors  of  a 
scaled  system  to  the  development  of  the  desired  full-scale  (target) 
system.  The  expended  or  projected  manpower  necessary  to  build  a 
full-scale  system,  given  a  scaled  system,  is  referred  to  as  the 
"Up-scaling  effort"  or  "Un-scaling  effort". 


Abstract

A study has been conducted to assess and quantify the cost impacts of
adopting scaled system software development techniques. First, various
software cost estimation relationships and models appearing in the open
literature were surveyed and evaluated. The results of the evaluation
were summarized in tables comparing each model's output (in terms of
total developmental effort) given inputs (expressed in deliverable
source lines of code (DSLOC)) varying over a range of system sizes. A
model (by Doty) was subsequently chosen (due to its consideration of
environmental factors) to evaluate cost impacts of scaled system
software development efforts versus unscaled, or "full-scaled",
development efforts. Preliminary results indicate that substantial cost
benefits can be achieved through the use of scaled systems developmental
techniques. These results are presented in tables showing costs of
various scaled development efforts versus unscaled efforts for projects
of varying magnitudes. From the study, a need was identified to further
investigate the subject after the development of cost estimation models
more specifically suited to the evaluation of environmental and
productivity impacts arising through the use of scaled system software
development methodologies. There is also a need to develop quantitative
models to account for the benefits of scaling additional aspects of a
software system, such as complexity, reliability, and data base.

Background

This study is an outgrowth of current research investigating software
cost estimation methodologies and scaled systems software development
benefits. Cost estimation is an integral factor in the research of
scaled systems because cost is a principal concern (along with quality
assurance and schedule/risk minimization) in the contemplation of scaled
systems development techniques.

Scaled  system  methodology 
partly  consists  of  implementing  and 
delivering  an  operational,  or 
semi-operational,  software  system 
"to  scale".  Such  a  system  probably 
would  not  support  all  of  the 
operational  and  structural 
characteristics  of  the  desired 
system  but  would  serve  as  a 
skeletal  testbed  for  the  system's 
engineers  and  customer  personnel  to 
evaluate  the  functional, 
operational,  and  performance 
characteristics  targeted  for  the 
ultimate  product  —  a  "full-scale" 
system.  Conceptually,  scaled 
systems  are  similar  to  the  scaled 
prototype  models  used  by  product 
designers  and  engineers  in  the 
shipping,  aircraft,  and  automobile 
industries  involved  in  the 
development  of  competitive, 
reliable,  and  quality  products. 


The  potential  benefits  of 
applying  scaled  system  technology 
to  a  software  development  project 
are  many.  First,  customer 
functional  and  operational 
requirements  may  be  refined  and/or 
solidified  through  the  benefit  of  a 
"hands-on"  evaluation  of  the  scaled 
prototype  system.  Second,  in 
pursuit  of  performance  or 
reliability  increases  or  to  verify 
that  full-scale  system  performance 
will  be  within  specifications,  the 
target  system  design  may  be 
optimized.  Evaluation  of  the 
scaled  prototype  may  reveal 
potential  efficiencies  or  economies 
in  the  system's  architectural 
structure  and  its  real-time 
management  of  on-line  resources. 
Third,  uncertainty  about  system 
technological  feasibility, 
architectural  soundness,  or 
reliability  may  be  reduced  or 
eliminated  through  the  experience 
provided  by  the  scaled  effort. 
Rather than being a costly (and wasted) by-product, experience gained
in the scaled effort is economically realized through the increased
productivity of system designers and programmers in the effort expended
to construct the ultimate target system, the "unscaling" effort. In
this way, experience is capitalized upon through feedback.
Additionally, experience aids project managers in their decision
process by reducing the uncertainty concerning the formulation of the
schedule for the unscaling effort.

To some degree, the attractiveness of scaled systems implementation
techniques is intuitive. The thrust of this study is to justify the
scaled systems approach based upon quantitative prediction of cost or
schedule savings.

Discussion 

At the outset, this research started with a survey of current
literature in search of software engineering predictive models. A
complete list of titles is included in the bibliography. Major titles
included "Workshop on Quantitative Software Models" published by the
IEEE, "Quantitative Software Models" by DACS, and "Elements of Software
Science" by Maurice Halstead. The models of Walston and Felix (IBM),
Putnam, Doty, and the System Development Corporation were selected to
be encoded into BASIC for execution on a RP-8000 microcomputer system.
These were selected primarily for their simplicity; they are all
regression-derived equations of the form: effort = constant times
number of instructions raised to an exponent (except for an alternate
form offered by Doty which computes an additional adjustment factor
based upon fourteen environmental characteristics). Additionally, two
theoretical approaches derived by Maurice Halstead to quantitatively
estimate program effort were implemented. The first appears in chapter
eight of his book under the subtitle of "Timing Equation
Approximations" [3], while the other appears in a reader's response to
Walston and Felix's article "A Method of Programming Measurement and
Estimation" [9], submitted by Professor Halstead.
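
The regression models described above share one functional form, which
can be sketched as follows. The constant and exponent used in the
demonstration are placeholders chosen only for illustration; the
coefficients actually encoded in the study are not given in this
section, and the function name is likewise an assumption.

```python
def power_law_effort(kdsloc: float, constant: float, exponent: float) -> float:
    """Effort = constant * size ** exponent.

    kdsloc   -- program size in thousands of deliverable source lines
    constant -- regression constant (specific to each model)
    exponent -- regression exponent (specific to each model)
    """
    if kdsloc <= 0:
        raise ValueError("program size must be positive")
    return constant * kdsloc ** exponent


if __name__ == "__main__":
    # Hypothetical coefficients, not those of any model in the study.
    A, B = 5.0, 1.05
    for size in (1, 10, 100, 1000):
        effort = power_law_effort(size, A, B)
        print(f"{size:>5}k DSLOC -> {effort:10.1f} person-months")
```

With an exponent above 1, effort grows faster than size: a tenfold
increase in size multiplies the estimate by 10 raised to the exponent.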

One objective of scaled system technology is to reduce the uncertainty
of software development. This uncertainty is often accounted for in
one way or another by current models through "environmental" factors or
constraints. Examples of these factors include Doty's attention to the
existence of a detailed operational definition or the presence of
changing customer requirements and Price-S's¹ "Complexity" input
parameter.

¹ Price-S is a proprietary parametric software development cost modeling
package invented by, and available for lease from, the RCA Corporation of
Cherry Hill, New Jersey.

Another objective of scaled system technology is to take advantage of
increased programmer productivity during the unscaling stage of the
software development cycle. Programmer inexperience has been shown to
represent a considerable burden on a software development effort over
the entire development schedule. This burden may be reduced or
eliminated during the unscaling effort as a result of programmer
experience gained through the scaled effort, yielding considerable
positive cost and schedule impacts.

At this point, we can begin to formulate an approach to estimating
scaled system savings. This approach examines the sensitivity of cost
(effort) to environmental and productivity considerations.

Approach

As mentioned, six models for programming effort were selected for the
study; four derived from regression analysis and two from an
interpretation of the natural laws governing human preparation of
computer programs (Halstead). Certainly, other models exist and more
are currently under development; these were, however, the most
accessible for quick implementation. Although accuracy was not a
primary consideration (the results were not explicitly intended to be
used for a cost proposal), representation of the basic relationships
existing between program size and effort over a range of program sizes
was seen to be crucial to a trade-off analysis of scaled systems
technology; therefore, a comparison study was in order.

Various program sizes ranging from one thousand to one million source
lines of code were systematically selected and input into the cost
models and their corresponding outputs were recorded. The results
appear in Table 1. The last column records the mean values of the
estimates, their standard deviations, and a disparity factor which was
computed by dividing the standard deviations by the means. Because the
Halstead model appeared to be quite unstable over such a wide range of
system sizes, its results were not included in the correlation
computations; only the four regression models' outputs were used in
these computations. Disparity factors for the various models' outputs
ranged from 4% to 46%, varying directly with program size. For program
sizes less than 100k source lines, the disparity factor did not exceed
30%. These figures show that, for systems smaller than 100k, the
models are in relatively close agreement.
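
The disparity factor described above is the standard deviation of the
models' estimates divided by their mean. A minimal sketch follows. The
report does not say whether a sample or a population standard deviation
was used, so the sample form is an assumption here, and the estimates
in the demonstration are hypothetical rather than values from Table 1.

```python
from statistics import mean, stdev
from typing import Sequence


def disparity_factor(estimates: Sequence[float]) -> float:
    """Standard deviation of the estimates divided by their mean.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, not something stated in the report.
    """
    return stdev(estimates) / mean(estimates)


if __name__ == "__main__":
    # Hypothetical person-month estimates from four models for one size.
    estimates = [90.0, 100.0, 105.0, 115.0]
    print(f"mean estimate    = {mean(estimates):.1f} person-months")
    print(f"disparity factor = {disparity_factor(estimates):.1%}")
```

A small disparity factor means the models agree closely at that program
size; a large one means their estimates spread widely around the mean.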

For this comparison study, the models were to be run without regard to
application. This assumption particularly impacted the Doty model,
which offers the option of selecting one of several different
application categories. For a description of the Doty model's
quantitative parametrics, see Exhibit 1. The figures for the Doty
model in Table 1 result from the selection of the "All" application
category. Table 1a is provided as additional information. It shows
the results when the Doty model is run with the selection of the
"Command and Control" application category, the proper application
selection for software projects in the DoD environment. Note, however,
that the disparity factors are greater in Table 1a due to the use of
this application category in the Doty model.

Through the facility of straightforward input-output models such as
those described, which yield estimated effort given projected system
size, only a limited approach to analyzing scaled system trade-offs may
be formulated. Such an approach is summarized in Table 2, using the
Doty model in the "All" application category (Table 2a shows the
results of the "Command and Control" application category). Given an
estimated full-scale system and its associated predicted effort
measure, the efforts required for implementing the system scaled from
10% to 90% were computed. The next step was to estimate the effort for
the unscaling effort. In the absence of environmental and productivity
computational factors, the estimated unscaling effort would equal the
full-scale estimated effort and, hence, no cost savings would be
reflected in the analysis. Certain savings factors would therefore
have to be assumed. The method chosen for Table 2 was to reflect
unscaling effort savings through a reduction in projected deliverable
source lines of code. This assumption appears to be a valid
representation of the anticipated increased programmer productivity
occurring during the unscaling effort. Reducing the number of
instructions shortens schedule much as if programmer productivity were
increased. If sought-after technical design economies are achieved,
they too could be reflected by a reduction in delivered source code for
the unscaling effort. Additionally, with the existence of an iterative
enhancement development approach [7], some of the design and code for
the unscaling effort would be completed prior to the start of the
unscaling effort, again resulting in effort and schedule savings.


Assuming the validity of these assumptions, Table 2 shows matrices
cross-listing scaled effort with net unscaling savings expressed as a
reduction in total deliverable source statements. As the table shows,
scaled system savings for any system size result when unscaling effort
savings equal or exceed the scale factor used during the scaled effort.
For example, given a scaled system of factor 30 (30% of the total
anticipated system size), the table predicts that total project savings
will result if the system is scaled and at least 30% savings can be
realized during the unscaling effort (this is only slightly different
in the "Command and Control" table, Table 2a).
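
The Table 2 trade-off can be sketched with a single size-to-effort
model. The sketch below assumes a power-law effort function with
placeholder coefficients (not the Doty values, which this section does
not list) and models the unscaling savings as a fractional reduction in
deliverable source lines, as the text describes. All names are
illustrative.

```python
def effort(kdsloc: float, constant: float = 5.0, exponent: float = 1.05) -> float:
    """Power-law effort estimate; the default coefficients are hypothetical."""
    return constant * kdsloc ** exponent


def net_savings(full_kdsloc: float, scale: float, reduction: float) -> float:
    """Full-scale effort minus (scaled effort + unscaling effort).

    scale     -- scaled system size as a fraction of the full system (0 to 1)
    reduction -- assumed fractional cut in DSLOC for the unscaling effort (0 to 1)
    """
    full = effort(full_kdsloc)
    scaled = effort(full_kdsloc * scale)
    unscaling = effort(full_kdsloc * (1.0 - reduction))
    return full - (scaled + unscaling)


if __name__ == "__main__":
    for reduction in (0.2, 0.3, 0.4):
        saved = net_savings(100.0, scale=0.3, reduction=reduction)
        print(f"scale 30%, unscaling reduction {reduction:.0%}: "
              f"net savings {saved:+.1f} person-months")
```

With these placeholder coefficients the sign of the net savings changes
between a 20% and a 30% reduction for a 30% scaled system, which is the
kind of break-even behavior the table is described as showing.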

Fortunately, and for the sake of better estimates of scaled system
savings, models which incorporate environmental and productivity
factors into their computations are available. One such model is a
variation supplied by Doty Associates. As with the other Doty models,
this model offers five application categories: utility, command and
control, scientific, business, and "all". In addition, fourteen
environmental factors are accounted for in the computation of projected
total effort. These fourteen factors are listed in the literature
(reproduced in Exhibit 2) and also in the Doty model's screen display
shown in Exhibit 3. Of these fourteen factors, two were deemed most
relevant to an analysis of scaled systems savings; these are termed
"detailed definition of operational requirements" and "changing
operational requirements".

In the development of modern software systems, more often than not a
detailed definition of the operational requirements is lacking;
therefore, the environment is one of changing operational requirements.
This phenomenon has been attributed to many reasons, but the
difficulties customer and systems personnel encounter when attempting
to communicate a system operational specification are probably
paramount. One objective of the scaled system development approach is
to aid these personnel in arriving at the system's operational
specification, and to allow them to economically modify or enhance it,
through the benefit of a scaled prototype system which they may
evaluate.
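
One way to picture how yes/no environmental answers could change an
estimate is a multiplicative adjustment of a base effort. The sketch
below is an assumption throughout: the multiplicative form, the factor
keys, and the weights are hypothetical and are not the Doty model's
actual algorithm or values, which this section does not reproduce.

```python
from typing import Mapping

# Hypothetical multipliers applied when a factor is answered "yes".
# Values above 1 raise the estimate; values below 1 lower it.
HYPOTHETICAL_WEIGHTS = {
    "detailed_definition_of_operational_reqmts": 0.90,
    "changing_operational_reqmts": 1.10,
}


def adjusted_effort(base_effort: float,
                    answers: Mapping[str, bool],
                    weights: Mapping[str, float] = HYPOTHETICAL_WEIGHTS) -> float:
    """Multiply the base effort by the weight of every factor answered yes."""
    result = base_effort
    for factor, present in answers.items():
        if present:
            result *= weights[factor]
    return result


# Full-scale environment: no detailed definition, changing requirements.
FULL_SCALE_ENV = {
    "detailed_definition_of_operational_reqmts": False,
    "changing_operational_reqmts": True,
}
# Unscaling environment: detailed definition exists, requirements are stable.
UNSCALING_ENV = {
    "detailed_definition_of_operational_reqmts": True,
    "changing_operational_reqmts": False,
}

if __name__ == "__main__":
    base = 100.0  # hypothetical base effort in person-months
    print(f"full-scale environment: {adjusted_effort(base, FULL_SCALE_ENV):.1f}")
    print(f"unscaling environment:  {adjusted_effort(base, UNSCALING_ENV):.1f}")
```

The point of the sketch is only the direction of the effect: changing
the two answers lowers the estimate for the same system size.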

Tables 3 and 3a illustrate the impact of these environmental factors
upon total estimated effort, given a discrete system size. The number
in the column labeled "Estimated Full-Scale Effort" gives the estimated
development effort in the absence of a detailed operational
specification and in the environment of changing requirements (see
Exhibit 3 for the environmental responses input to the model to arrive
at these figures). For the figure appearing in the column entitled
"Unscaling Effort", these constraints were removed (Exhibit 3a lists
the responses used to characterize the unscaling environment). This
figure represents the additional effort necessary to construct the
full-scale system, having completed the scaled system implementation
and evaluation. The difference in the two figures is the amount of
effort which is economically available for the scaled development
effort. Estimated efforts based upon the various scale factors are
also listed. In arriving at these figures for the scaled efforts, the
constraints of no detailed operational specifications and changing
requirements were included in the analysis (see Exhibit 3). Another
constant relationship was found regardless of system size: a scaled
effort of factor 10 (40 for the "Command and Control" application
category) was necessary for any significant effort savings to be
realized. The saved effort, however, was significant. For example, in
the 100k case, the model showed that savings of approximately 18 person
months could be realized if a 1/10 scaled system could be implemented
and a detailed operational specification developed as a result.
Assuming a cost of $5000 per person month, this savings translates to a
total of nearly $90,000. Exhibit 4 graphically portrays the
relationships expressed in Table 3. Of significance is the fact that
the data of Table 3 assumes no productivity changes between the
full-scale and scaled approaches, which, if present, could even more
dramatically increase developmental savings. The "Command and Control"
application category of the Doty model predicted a higher break-even
scale factor point than the "All" category (see Exhibit 4a). This can
be attributed to the greater exponent found in the effort algorithm and
the different weights offered for the environmental factors. Exhibit 5
offers a generalized portrayal of a scaled system break-even
cost/benefit analysis.

Conclusions

The limited study done here with the aid of simplistic cost estimation
models points to significant cost savings resulting from the use of
scaled system development methodologies. None of the analytical
approaches presented here, however, accounts for beneficial
productivity changes anticipated for unscaling efforts. From Exhibit
6, the data of Walston and Felix of IBM project 50-180% productivity
increases based upon programmer experience. Future work in the areas
of software engineering and cost modeling with attention to cost and
schedule drivers and productivity factors will benefit system
architects both in schedule estimation and scaled system methodology
analysis. Research with more accepted parametric models such as
Price-S might lead to greater insight into the potential cost benefits
derived from the use of scaled system technology. Attention should be
paid, however, to possible parametric impacts of such factors as
complexity, reliability, and data base which may require the
development and use of additional parametric relationships.
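
The savings arithmetic applied to Table 3 can be sketched as follows.
The person-month inputs are the 100k DSLOC, 10% scale factor values as
best they can be read from Table 3 in this copy, so they should be
treated as approximate; the function names are illustrative, and the
$5000 rate is the one assumed in the text.

```python
COST_PER_PERSON_MONTH = 5_000  # dollar rate assumed in the text


def available_for_scaled_effort(full_scale: float, unscaling: float) -> float:
    """Effort that can be spent on the scaled system before savings vanish."""
    return full_scale - unscaling


def savings_with_scaling(full_scale: float, scaled: float, unscaling: float) -> float:
    """Full-scale effort minus the total effort with scaling."""
    return full_scale - (scaled + unscaling)


if __name__ == "__main__":
    # Approximate readings of the 100k row of Table 3 (person-months).
    full_scale, scaled, unscaling = 343.93, 30.87, 295.09
    budget = available_for_scaled_effort(full_scale, unscaling)
    saved = savings_with_scaling(full_scale, scaled, unscaling)
    print(f"effort available for the scaled system: {budget:.2f} person-months")
    print(f"net savings: {saved:.2f} person-months")
    print(f"net savings in dollars: ${saved * COST_PER_PERSON_MONTH:,.0f}")
```

Savings appear only when the scaled effort is smaller than the
difference between the full-scale and unscaling estimates, which is why
only the smallest scale factors show a net benefit.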
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EXHibiT  3 


From the Doty & Associates (RADC) Studies:

Please Select an Application Category:

1 - Utility (OS)
2 - Command & Control (C2)
3 - Scientific
4 - Business
5 - All (Others not listed above)

Selection (1-5)?

Estimated Deliverable Source LOC (1,000's)? < from tables >
(S)cale, (U)pscale, or (O)ption? C

Please input a yes/no (Y/N) response to each of these 14 questions:

                                                        Responses

Special display?
Detailed definition of operational req'mts?
Change to operational req'mts?
Real time operation?
CPU memory constraint?
CPU time constraint?
First S/W developed on CPU?
Concurrent development of ADP H/W?
Time share, vis-a-vis batch processing, in dev'ment?
Off-site development computer facilities?
On-site development computer facilities?
Development computer different than target computer?
Multi-site development computer facilities?
Unlimited programmer access to computer facilities?

[responses not legible in this copy]

9999.99 Man Months req'd for analysis, design, code, debug, test and checkout.
( Standard error on this approximation = 99.9 % )

Estimated schedule duration = 999.99 Months

Continue (Y or N)?


Environmental Responses for
Full-scale and Scaled Efforts
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Exhibit  3a 


From the Doty Associates (RADC) Studies:


Please Select an Application Category:

1 - Utility (OS)
2 - Command & Control (C2)
3 - Scientific
4 - Business
5 - All (Others not listed above)

Selection (1-5)?

Estimated Deliverable Source LOC (1,000's)?
(S)cale, (U)pscale, or (O)ption?


Please Input a yes/no (Y/N) response to each of these 14 questions:

                                                          Responses:

Special display?                                          No
Detailed definition of operational req'mts?               Yes
Change to operational req'mts?                            No
Real time operation?                                      No
CPU memory constraint?                                    No
CPU time constraint?                                      No
First S/W developed on CPU?                               No
Concurrent development of ADP H/W?                        No
Time share, vis-a-vis batch processing, in dev'ment?      Yes
Off-site development computer facilities?                 No
On-site development computer facilities?                  Yes
Development computer different than target computer?     No
Multi-site development computer facilities?               No
Unlimited programmer access to computer facilities?       Yes


9999.99 man months req'd for analysis, design, code, debug, test and checkout.
( Standard error on this approximation = 99.9 % )


Estimated schedule duration = 999.99 months
Continue (Y or N) ?


Environmental Responses for
Up-scaling Efforts

Table 3

Size       Estimated         Percent   Scaled     Upscaling   Total Effort
(1,000's   Full-Scale        Scale     Effort     Effort      with Scaling
DSLOC)     Effort            Factor
           (Person-Months)

    1          2.77            90%        2.48       2.38         4.86
                               80         2.19       2.38         4.57
                               70         1.91       2.38         4.28
                               60         1.62       2.38         4.00
                               50         1.34       2.38         3.72
                               40         1.06       2.38         3.44
                               30          .79       2.38         3.17
                               20          .51       2.38         2.89
                               10          .25       2.38      [illegible]

   10         30.87            90%       27.64      26.48        54.11
                               80        24.43      26.48        50.91
                               70        21.25      26.48        47.73
                               60        18.08      26.48        44.56
                               50        14.94      26.48        41.42
                               40        11.83      26.48        38.31
                               30         8.75      26.48        35.23
                               20         5.72      26.48        32.20
                               10         2.77      26.48        29.25

  100        343.93            90%      308.01     295.09       603.10
                               80       272.28     295.09       567.37
                               70       236.75     295.09       531.84
                               60       201.46     295.09       496.55
                               50       166.45     295.09       461.54
                               40       131.77     295.09       426.86
                               30        97.50     295.09       392.59
                               20        63.78     295.09       358.87
                               10        30.87     295.09       325.96

 1000       3832.41            90%     3432.13    3288.22      6720.35
                               80      3033.94    3288.22      6322.16
                               70      2638.09    3288.22      5926.31
                               60      2244.90    3288.22      5533.12
                               50      1854.79    3288.22      5143.01
                               40      1468.35    3288.22      4756.57
                               30      1086.47    3288.22      4374.69
                               20       710.64    3288.22      3998.86
                               10       343.93    3288.22      3632.15


[Table 3, continued. Columns as above: Estimated Full-Scale Effort (Person-Months), Percent Scale Factor, Scaled Effort, Upscaling Effort, Total Effort with Scaling. The handwritten entries on this sheet are not legible in this copy.]


Exhibit  4 


'ALL' 


Estimates with considerations of quality
of functional requirements definition and
environment of changing operational
requirements reveal significant cost
savings can result from the use of
scaled system development techniques.


Exhibit  4a 


Estimates with considerations of quality of functional
requirements definition and environment of changing
operational requirements reveal significant cost savings
can result from the use of scaled system development
techniques.


Exhibit  5 


[Graph: total effort plotted against % Scale Factor; an annotation marks the scale factors which promote savings.]


Notes : 

1. "N" represents the expected effort to design and implement
the full-scale version of a proposed system.

2.  The  curve  represents  the  total  cost  of  a  scaled  system  effort; 
that  is,  the  cost  of  the  scaled  system  effort  plus  the  cost  of 
the  up-scaling  system  effort. 


Exhibit  6 


Impact of Experience on
Programmer Productivity

source: Walston & Felix (IBM)

Experience              Productivity     % Increase

Without experience      146 DSL/PM
With experience:
  Some                  221 DSL/PM       + 51%
  Extensive             410 DSL/PM       +180%

DSL = Delivered Source Lines
PM  = Person-Months

A.  ABSTRACT

This paper summarizes the research leading to and involved in the
development of a list of software scale parameters. Software scale
parameters are those aspects of automated systems that can be reduced in
scope in order to implement a cost-effective system scaled with respect
to the full-scale system objective.

The work of Yourdon, Tausworthe, Dijkstra, Knuth, Parnas, Mills,
Belady, Lehman, Basili, Tinker, Preiser, Halstead, House, Musa, Turner,
and others has been studied in an attempt to isolate various elements of
software  systems  most  suitable  to  scaling.  Each  scale  parameter  is 
defined  in  detail,  with  sufficient  background  to  introduce  the  area. 
Areas  for  consideration  include  functionality,  data  base  characteristics, 
maintainability,  security,  reliability,  performance,  language  and 
configuration.  These  areas  will  form  the  foundation  for  later  work  under 
the  small  scale  system  design  effort. 

B.  OBJECTIVE 

The  objective  of  this  report  is  to  identify  parameters  of  software 
systems  subject  to  scaling  and  to  begin  a  definition  of  scale  factors 
associated  with  each.  These  scale  factors  will  be  used  to  develop 
appropriate  metrics  to  standardize,  quantify,  and  objectively  describe  a 
scaled  system  in  terms  of  the  full-scaled  system  it  represents. 

C.  DISCUSSION 

An analysis of current software systems development methodologies
has been conducted to isolate the elements most suitable to scaling.
Considerable research has been performed to identify other work in the
software engineering field that would be applicable to the scaled systems
project.

It is particularly important to identify the point in the system
development cycle at which it is appropriate to employ a scaled system.
A  major  constraint  to  this  decision  is  the  availability  of  sufficient 
information  on  which  to  base  scale  factor  decisions.  A  further 
limitation  is  imposed  by  the  need  to  define  requirements  to  the  total 
system  level  before  scaling  to  a  whole  system  reference  is  possible. 

Current software system development methodologies emphasize a
process in which development is conceived as proceeding through a series
of phases. Each phase is organized to complete a specific planned
process and produces output in terms of information or design documents
which, in turn, is input to the next phase. Referring to the DoD
lifecycle description, this process begins with the initiation phase and
progresses through the development, evaluation and operation phases.
Most attempts to improve the efficiency of the development cycle have
concentrated on improving the processes which comprise some single phase.

Thus structured programming focuses on the programming stage of the
development phase while composite design applies to the design stage of
the  development  phase.  The  scaled  system  approach,  as  it  is  envisioned, 
bridges  the  gap  between  the  definition  and  design  stages  of  the 
development  phase. 

To further clarify this conclusion, consider the activities that
make up the definition stage. Robert Tausworthe, in Standardized
Development of Computer Software, calls this the program definition or
functional specification phase, which he divides into two activities:
that of creating the software requirement (Figure 1) and the software
definition (Figure 2). As Tausworthe explains it, the creation of the
software requirement further consists of two parts, both largely
non-technical, to conceptually lay out the requirement. The first part,
that of planning information, establishes the requirement for the
software. The second part, the user requirements, establishes the
requirements of the software.

[Figure 1. The Software Requirement (diagram of the software requirement document)]

Following the conceptual activity of software requirement creation
comes the functional definition of the software. This is a technical
activity which, when complete, defines both what the software is to do
(not how it is to do it) and the meaning of program correctness.

The requirements and definition activities are an iterative cycle.
Concurrent interaction between requirements, definition and approved
amendments is a necessary activity to achieve a final balance between
software requirements and feasible system definition before the detailed
design process begins. It is at this stage that the definition criteria
may be applied to the development of scale factors and the preliminary
requirements for scaling established. Attempts to define scale factors
earlier in the process will suffer from insufficient data. Factor
definition at a later point will be constrained by the progress in
detailed design. It should be reiterated that definition to the total
system level is necessary in order to provide the total system baseline
to which one must scale.

D.  TECHNICAL  APPROACH 

The technical approach to the process of deriving scaled parameters
from various elements of the software definition is addressed in the
remainder of this report. The software elements applicable to the
scaling of data base, performance, functionality, security,
maintainability, reliability, language and configuration are described
and defined in detail with examples. It should be noted that there are
two components of scaling to be considered. The first is the
identification of the scaling parameters themselves such as size,
modularity, etc., and the second component is the measured effect on
items such as the throughput and utilization of the total system itself.

AD-A110 867    INCO INC MCLEAN VA    F/G 9/2
SMALL SCALE SYSTEMS. (U)
SEP 81    M KERCHNER, P SRIMES    F30602-80-C-0219
UNCLASSIFIED    INCO/1155-681-TR-46-D(F)    RADC-TR-81-251    NL

1.  Data  Base  Characteristics 

A data base is a collection of data records between which
specific relationships exist. These relationships may be used to link
record types and records of the same type. A record is an aggregate of
data transcribed, or in a form suitable for transcription, between a
computer and an external medium. Each record comprises data (normally
called fields) that have an underlying relationship to one another. Data
elements in a record may be of similar or dissimilar types: bits,
numbers, character strings, etc. Records of the same type are usually
grouped  into  larger  aggregates  called  files.  In  practice,  a  large  file 
may  contain  hundreds  or  thousands  of  blocks,  each  containing  one  or  more 
records. 

Data base scale parameters derive from two areas. The first
area concerns the complexity of the access system and the second concerns
the various size elements involved at each level of the data base
structure.

a. Data Base Complexity

There are at present three major data base access methods
that must be examined for scaling purposes. They are, in increasing
order of complexity:

1)  Sequential  Access 

The term sequential access is used when the access to
records is through a key by which the file is physically sequenced.
Access is therefore serial, i.e., each item must be examined in sequence
until a key match is found.

2) Indexed Sequential

The indexed sequential access method (ISAM) refers to
a setup whereby an index table is established through which record access
is made. In ISAM, one or more items in each record are chosen as the
"key". The index then consists of an ordered sequence of the values of
the ISAM key which occur in the collection of records that compose the
data base. Associated with each value is an address or pointer to the
record. The file is stored in some kind of direct access storage such as
disk or drum so that once an address is retrieved from the table, the
associated record may be accessed directly. Note that although the record
may be accessed directly, the process of finding the pointer to the
record still involves a sequential (or perhaps binary, if the table is
ordered) search of the index table. The advantage of ISAM is that once
the address is determined, all data may be accessed with equal ease.
ISAM is often used for very long files containing thousands of records.
Table search time is considerably less than the time required to search
through each record.
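The contrast between the first two access methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the helper names (sequential_lookup, build_index, indexed_lookup) and the sample file of 1,000 records are hypothetical and do not come from the report, and an in-memory list stands in for direct access storage.

```python
from bisect import bisect_left


def sequential_lookup(records, key):
    """Examine records in physical order; return (record, records examined)."""
    for examined, record in enumerate(records, start=1):
        if record["key"] == key:
            return record, examined
    return None, len(records)


def build_index(records):
    """Build the index table: ordered key values plus matching record positions."""
    pairs = sorted((record["key"], pos) for pos, record in enumerate(records))
    keys = [key for key, _ in pairs]
    positions = [pos for _, pos in pairs]
    return keys, positions


def indexed_lookup(records, keys, positions, key):
    """Binary-search the ordered index, then fetch the record directly."""
    slot = bisect_left(keys, key)
    if slot < len(keys) and keys[slot] == key:
        return records[positions[slot]]
    return None


# Hypothetical file of 1,000 records with even-numbered keys.
records = [{"key": k, "payload": f"record-{k}"} for k in range(0, 2000, 2)]
keys, positions = build_index(records)

found, examined = sequential_lookup(records, 1000)
same = indexed_lookup(records, keys, positions, 1000)
```

In this sketch the sequential search examines every record ahead of the match, while the indexed search touches only the index table and then one record, which is the difference the scaling step from sequential to indexed sequential would measure.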

3) Random Access

For this access method, no index table is maintained.
Instead, to decide where in storage a record may be accessed, the value
of the key is used as input to some hashing algorithm which is designed
to produce as output a storage address. This address is not necessarily
a physical address in the sense of stipulating exactly which physical
location in direct access storage will be used, but may be a logical
address within some area.
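A minimal sketch of hashed (random) access follows, assuming a CRC-32 checksum as the hashing algorithm and a fixed number of logical buckets. Both choices, the HashedFile class and the sample keys are illustrative assumptions; the report does not name a particular hashing algorithm.

```python
import zlib


def bucket_address(key, bucket_count):
    """Map a key to a logical bucket number (not a physical disk address)."""
    return zlib.crc32(key.encode("utf-8")) % bucket_count


class HashedFile:
    """Toy random-access file: records live in the bucket their key hashes to."""

    def __init__(self, bucket_count):
        self.buckets = [[] for _ in range(bucket_count)]

    def store(self, key, record):
        self.buckets[bucket_address(key, len(self.buckets))].append((key, record))

    def fetch(self, key):
        # Keys that hash to the same bucket are resolved by a short scan.
        for stored_key, record in self.buckets[bucket_address(key, len(self.buckets))]:
            if stored_key == key:
                return record
        return None


hashed = HashedFile(bucket_count=8)
hashed.store("UNIT-0042", {"name": "hypothetical unit record"})
```

No index table is consulted: the key alone determines where to look, and only records whose keys collide in the same bucket need to be compared.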

In the case where the data base is organized in
hierarchical form, that is, in applications where a natural hierarchy of
relationships exists between data items and any data subset is contained
entirely within its superset, another form of access may be used. The
root, or parent, record may be located by either a hashing method or by
sequential search of a table. Subsequent records "lower" in the
hierarchy may then be accessed through direct access pointers. Direct
access pointers may also be used in the network model in which the data
structure sets serve as the logical links between records of different
types and reflect the data organization rather than an exact
representation of entities. The data structure may be quite complex in
that one record may be linked with any other and have any number of
superiors or subordinates.

Another  form  of  direct  access  occurs  with  the 
relational  data  base.  In  this  model,  largely  experimental,  data  are 
organized  into  tables  (relations)  each  of  which  may  be  directly  accessed 
through the table name. Row and column ordering has no significance and
each column (or domain) may be directly accessed.

Data base complexity may be scaled by first employing
the simplest access method (sequential) to model a data base and then
developing the scaling relationship involved in increasing the complexity
from sequential to indexed sequential to random access. At another
level, it is also possible to scale complexity by restricting to single
access a system which in full scale would allow multiple access to the
data base.

The scaling step from a sequential data base to an
indexed sequential one is straightforward. Quantitative measurement of
the results of this scaling step is of course based on size factors such
as the number of records in the data base. The average access time
for a sequential search is directly related to the number of records.
For an indexed sequential access only a portion of the access time (the
table search) may be directly attributable to the number of records.

The next quantitative step, to a random data base, is
expected to be more difficult to scale, as the measurement parameters of
a random data base configuration are highly dependent on data base usage
and organization requirements. However, given a constancy in data
structure, measurement is still possible. Considering a hashing approach
to address determination, access time difference between the indexed
sequential and random access methods is the difference between the
average table search time and the time necessary to execute the hashing
algorithm (including time to resolve duplicated references).
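The relationships described above can be made concrete with a rough cost model. The formulas below are common textbook approximations, not measurements from this study: an average of (N+1)/2 records examined for a successful sequential search, about log2(N+1) index entries for a binary search of an ordered index table, and about 1 + load/2 probes for a hashed file that resolves duplicated references by chaining. The function names are hypothetical.

```python
import math


def sequential_probes(record_count):
    """Average records examined in a successful sequential search."""
    return (record_count + 1) / 2


def index_probes(record_count):
    """Worst-case index entries examined by a binary search of the index."""
    return math.ceil(math.log2(record_count + 1))


def hashed_probes(load_factor):
    """Approximate probes for a successful lookup with chained collisions."""
    return 1 + load_factor / 2


n = 1000
costs = {
    "sequential": sequential_probes(n),
    "indexed sequential": index_probes(n) + 1,  # index search plus one record fetch
    "random (hashed)": hashed_probes(0.5),
}
```

Under these assumptions the sequential cost grows linearly with the number of records, the indexed cost grows only logarithmically, and the hashed cost depends on how full the file is rather than on its length, which is why the last step is the hardest to express as a simple size ratio.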

Another method for scaling complexity is to scale the depth
and complexity of the data structure which describes the relationships
between data items. Any number of possibilities exist with this approach.
A hierarchical data base could be scaled, regardless of access method, by
limiting the number of immediate successors to a node or the branch
points at a given level. Alternatively, the number of levels could be
scaled. As yet another example, for a direct access network, the pointer
chain could be limited to only the forward direction.

b. Data Base Size

Aspects  of  data  base  size  are  relatively  easy  to 
scale,  requiring  merely  numerical  quantification  of  appropriate  elements. 
The  various  size  elements  subject  to  scaling  in  a  data  base  are: 

1)  Number  of  Files 

The number of files may be scaled by applying a
straight percentage to the total number of files. (Unless all files are
of equal length, the total data base size will not necessarily be scaled
by this same percentage.)

2)  Length  of  Files  (numbers  of  records) 

The number of records may be reduced and the
scale factor determined from a ratio of the number of records remaining
(in all files) to the total number of records.

3)  Number  of  Access  Keys 

Scaling could be applied by limiting access to the
data base through a single prime key. The scale factor in this case is,
of course, related to the number of secondary keys in the final system.

4) Number of Fields

The number of fields in all records of a given
type may be scaled as a ratio of the number of fields remaining to the
total number of fields.

5)  Length  of  Record/Field 

The length of a record may be scaled by
considering a ratio of the number of characters in the scaled record
compared to the total number of characters. The same scaling could be
applied to each or selected fields.
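The size ratios listed above reduce to simple arithmetic. The sketch below uses an invented four-file data base (the file names and record counts are hypothetical) to show the point made under Number of Files: keeping half of the files does not keep half of the records unless the files are of equal length.

```python
def scale_factor(scaled_count, full_count):
    """Ratio of what the scaled system keeps to the full-scale quantity."""
    if full_count <= 0:
        raise ValueError("full-scale quantity must be positive")
    return scaled_count / full_count


# Hypothetical full-scale data base: file name -> number of records.
full_files = {"personnel": 10_000, "equipment": 2_000,
              "messages": 50_000, "units": 500}

# The scaled system keeps only two of the four files.
scaled_files = {name: full_files[name] for name in ("personnel", "equipment")}

file_factor = scale_factor(len(scaled_files), len(full_files))
record_factor = scale_factor(sum(scaled_files.values()), sum(full_files.values()))
```

Here the data base is scaled to 50% by number of files but to only about 19% by number of records, so the two factors have to be reported separately.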

It is possible, of course, to scale access
complexity, data structure complexity and size elements in combination,
but measurement of scaling results then becomes increasingly difficult.
For example, how would one compare a sequential access, hierarchical,
single key data base to a multifile relational data base? As has
already been implied, there is mutual dependence between the scaling
factors of type and those of size. All of these relationships will be
examined at the point in the study where factor quantification is
addressed.

A final point on data bases and scaling:
scaling of access type and, to some degree, data structure complexity may
be restricted if the system being developed is expected to use an
existing data base management system. Even the most sophisticated and
general purpose system is restricted in the organization of the data base
it can manipulate. Scaling of data base complexity could, at some point,
involve a data base management system customized to the scaled system.

2.  Performance 

Performance or efficiency objectives such as response times and
throughput rates under a variety of workloads and configurations are an
important part of most system designs. Efficiency can rarely be
specified as an absolute because it is influenced by such factors as the
hardware configuration, telecommunication line speeds, the efficiency of
all other concurrently executing programs and the number of active
terminal users, to name a few.

Performance may be interpreted as the technical equivalent of
the economic notion of value. That is, performance is what makes a
system valuable to its user. Like value, the concept of performance is a
subjective one. This means that different people tend to use different
performance indices in assessing systems. However, it is often possible
to translate subjective definitions of performance into purely technical
terms, which can sometimes be quantified and therefore objectively
evaluated.

These  elements  may  be  considered  to  be  scaling  elements  and 
thus  developed  and  measured  for  scaled  system  use;  either  to  be  scaled, 
or  to  measure  the  effect  of  scaling. 

The most common classes of quantitative performance indices for
computer systems are:

a.  Productivity 

Productivity  is  generally  defined  as  the  volume  of 
information  processed  by  the  system  in  a  unit  time.  One  measure  of 
productivity  is  the  throughput  rate,  which  during  a  given  interval  of 
time,  is  the  average  rate  at  which  jobs  are  completed  by  the  system  in 
that  interval. 

Throughput  may  be  scaled.  If,  for  example,  the  full  scale 
system  is  to  process  2000  messages  per  day,  the  scaled  system  might  be 
required  to  process  only  500.  This  system  would  be  throughput  scaled  to 
25%  of  the  full-scale  system. 

Throughput is, of course, a result of nearly every aspect
of a system configuration, from the hardware itself to the functions the
system is required to perform to the typical set of jobs requiring system
resources. The system configuration from both a hardware and software
viewpoint will be discussed later. In general terms, however, consider
how throughput might be scaled:

1) System capacity. As the maximum rate at which a system
can perform work, capacity has a direct effect on throughput. The scaled
system might handle only jobs with primary memory requirements of 10k as
opposed to 100k for the full-scale system; or jobs using less than 30
seconds of processor time versus two minutes; or those using only one
printer and a card reader rather than the several I/O devices jobs on
the full-scale system might require.

2) System job mix. Although the full-scale system might
be required to process some number of job types arriving at random, the
scaled system job mix might be structured for optimum performance.
Depending  on  the  application,  this  might  mean  grouping  all  jobs  of  type  A 
together.  Conversely,  in  a  multiprocessing  environment,  since  all  jobs 
of  the  same  type  might  compete  for  the  same  resources,  types  A  and  B 
might  be  alternated  in  the  job  stream.  As  a  final  example,  one  could 
scale  by  configuring  for  average  expected  work  load  rather  than  peak 
load. 

b.  Responsiveness 

The  term  responsiveness  can  be  defined  as  the  time  between 
the  presentation  of  an  input  to  the  system  and  the  appearance  of  the 
corresponding  output.  A  measure  of  responsiveness  is  the  response  time, 
which  is  the  time  elapsed  between  entering  a  request  and  the  computer's 
acknowledgement of it. In general, the response time depends on the
request, on the system, and on the work load in the system at the time
the request is entered. Nevertheless, response time is a valid parameter
to scale. We might require the target system to support 20 analysts with
5 seconds response. For scaled system development, a response time of 15
seconds for 5 users might be adequate. In such a case, the system would
be scaled 25% with respect to number of users and 33% with respect to
response time.

A better term would be interactive responsiveness (the
inverse of response time) or the number of responses per unit time. This
keeps  a  consistency  in  terminology  whereby  scaling  refers  to  reducing  the 
value  of  a  parameter.  In  terms  of  this  example,  by  scaling  interactive 
responsiveness  we  are  accepting  4  responses  per  user-minute  as  compared 
to  12. 
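The arithmetic behind the throughput, user and responsiveness examples in this section can be checked with a short sketch. The helper names are hypothetical; the input figures (2000 and 500 messages per day, 20 and 5 users, 5 and 15 second response times) are the ones quoted in the text.

```python
def scale_factor(scaled_value, full_scale_value):
    """Fraction of the full-scale value that the scaled system provides."""
    if full_scale_value <= 0:
        raise ValueError("full-scale value must be positive")
    return scaled_value / full_scale_value


def responses_per_minute(response_seconds):
    """Interactive responsiveness: the inverse of response time, per minute."""
    return 60.0 / response_seconds


throughput_scale = scale_factor(500, 2000)   # messages per day
user_scale = scale_factor(5, 20)             # analysts supported
responsiveness_scale = scale_factor(responses_per_minute(15),
                                    responses_per_minute(5))
```

Expressing responsiveness as responses per unit time keeps every factor below 1.0 for a scaled-down system: 4 responses per user-minute against 12 gives the same 33% quoted for response time.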

Below a certain threshold on the low end of the scale,
human users can no longer appreciate a reduction in response time (an
increase in interactive responsiveness). At the other extreme, at some
point response times get unacceptably long and the level of user
satisfaction drops to the point where longer response times make no
difference. Since even on a scaled system, user satisfaction may be of
some importance, the degree to which responsiveness is scaled should be
limited by the characteristics of the users.

c. Utilization

The term utilization is generally defined as the ratio
between the time a specified part of the system is used during a given
interval and the length of that interval. Examples of utilization include
hardware module (CPU, memory, I/O channel, and I/O device) utilization
and utility package utilization.

Modules may be linearly scaled as the ratio between
proposed and actual module utilization. The scale factors may be
measured in terms normal for the module, e.g., CPU utilization is
measured in instructions per unit time, memory utilization is measured as
a percentage of total memory available, I/O channels as either a data rate
or channel ratio, etc.

As an example, we might scale utilization by requiring
that the developmental system require utilization of only 50% of
available capacity. It has been generally shown that this scaling of
performance requirements will result in a development cost one-third that
of a system requiring 90% utilization of resources (Barry Boehm,
Practical Strategies for Developing Large Software Systems). In this
case, the system would be utilization-scaled to approximately 55%
(50/90, or about 0.556) of that of the target system.
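A small sketch of how utilization and its scale factor might be computed follows. The busy-interval representation and the function name are assumptions made for illustration, and the intervals are assumed not to overlap; the 50% and 90% figures are those used in the example above.

```python
def utilization(busy_intervals, window_start, window_end):
    """Fraction of the observation window during which the module was in use.

    busy_intervals is a list of (start, end) times, assumed non-overlapping.
    """
    busy = sum(min(end, window_end) - max(start, window_start)
               for start, end in busy_intervals
               if end > window_start and start < window_end)
    return busy / (window_end - window_start)


# Hypothetical CPU busy periods (seconds) observed over a 100-second window.
measured = utilization([(0, 30), (40, 60)], 0, 100)

# Utilization scale factor for the example in the text: 50% against 90%.
utilization_scale = 0.50 / 0.90
```

The measured value in this sketch is 0.5, and the resulting scale factor is roughly 0.56 of the target system's required utilization.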

d.  Operating  System/Organization 

Performance scaling may be accomplished on a more
fundamental (and probably less quantifiable) level by several other
methods. Although the parameters mentioned below represent a mode of
operation rather than a measurable ratio and are not always the object of
a design effort, they do affect the total system effort. Hence the
choice of one mode over another is a valid method to scale performance.

1) Processing mode. Several possibilities come to mind
here. Consider a system where batch, interactive and real time
requirements must all be supported. Advantages in terms of development
time would certainly accrue to a scaled system which considered merely a
single mode. Similarly, a real time system such as a tracking network
could be scaled with a batch system which used simulated input data.


2) Operating system. Although the choice of processing
modes is certainly dependent on the operating system (or vice versa), the
operating system presents other ways to scale. While the ultimate system
might require a custom operating system, scaling could be accomplished by
choosing an off-the-shelf system or by modifying an existing one. Given
that an existing operating system is to be used, one could scale by
leaving unnecessary functions in the executive of the scaled system.

3) Interrupt processing. Closely related to other areas
such as choice of processing mode, number and type of peripherals, system
functions, etc., the mechanisms for interrupt handling may be considered
as a separate parameter. Several different approaches are possible. For
example, certain (or all) interrupts might be ignored until the CPU is
free. Alternately, a priority interrupt scheme could be scaled with a
simple system of queued interrupts.
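The difference between a priority interrupt scheme and a simple queue can be illustrated with a toy model. It assumes that all interrupts are already pending when service begins and that a lower number means a higher priority; the device names and function names are invented for the example.

```python
import heapq
from collections import deque


def service_order_queued(pending):
    """Scaled scheme: interrupts are serviced strictly in arrival order."""
    queue = deque(pending)
    order = []
    while queue:
        _priority, source = queue.popleft()
        order.append(source)
    return order


def service_order_priority(pending):
    """Full-scale scheme: the highest-priority pending interrupt goes first."""
    heap = [(priority, arrival, source)
            for arrival, (priority, source) in enumerate(pending)]
    heapq.heapify(heap)
    order = []
    while heap:
        _priority, _arrival, source = heapq.heappop(heap)
        order.append(source)
    return order


# (priority, source) pairs in arrival order; 1 is the most urgent.
pending = [(3, "printer"), (1, "clock"), (2, "disk")]
queued = service_order_queued(pending)
prioritized = service_order_priority(pending)
```

The queued version needs no priority logic at all, which is the development saving, at the cost of servicing the clock interrupt after the printer.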

There are a number of other performance affectors
such as the ease of use of a system, the structuredness of a program or
of a language, and the power of an instruction set. However, they are
not considered in the scaling process because they are difficult or
impossible to quantify.

3.  Functionality 

Large programs are often decomposed into a set of interacting
functional components (e.g., modules, procedures, subroutines, etc.).
This principle by which program concepts evolve in a natural, structured
way emerged from Dijkstra's work on the "THE" Multiprogramming System.
He conceived that a program could be organized into hierarchical levels
of support. The principle, known as levels of abstraction, formed the
basis for what has since become known as structured programming. At each
level of abstraction, it is useful to study the needs of the problem,
that is, to identify all the relevant elements of control and data and
the relationships between them.

System structure refers to the way in which complex functions
and interrelationships may be characterized in terms of successively
simpler components sometimes called modules. Structure primarily
manifests itself in terms of relationships such as adhesiveness and
coupling within and among the system's modules, the architecture of the
functions and data flows, and the information structures. Each component
forms a natural unit on which to focus attention when attempting to scale
the system. Independence of the modules determines the modularity of a
system.

Usually,  structured  software  is  organized  into  a  master  module 
which calls subordinate modules, which in turn link to modules which are
further  subordinate  and  so  on  down  the  hierarchical  or  functional  chain. 
In  principle  then,  the  abstract  description  of  a  given  component  will 
embody  information  about  the  entire  chain  of  its  subordinate  components. 

Functional  scale  factors  will  depend  on  the  degree  of 
modularity  developed  in  a  hierarchical  system.  The  applicability  of 
modular  scaling  would  be  dependent  upon  the  type  and  degree  of  coupling 
between  modules  and  levels  of  modules.  There  are  three  types  of  coupling 
to  consider: 

a. Data coupling - a form of coupling caused by an
intermodule connection that provides output from one module which serves
as input to another module.

b. Control coupling - a form of coupling in which there is a
connection between two modules that communicates control.

c. Hybrid coupling - a strong form of coupling that occurs
when one module modifies the procedural contents of another module.

The significance of coupling with respect to scaling is
determined by the direction and strength of the connection. If the
functional coupling between levels is weak, then the details in the
description of the lower level modules rapidly become insignificant with
respect to the higher levels. In this case a level in the calling
hierarchy may correspond fairly closely to a level of functional
description, and scaling by modular elimination of a horizontal module
chain (function level) is feasible and is represented by the outline B
of Figure 3. This of course corresponds to the elimination of some
number of primitive functions across the entire system. The percentage
of functions retained could be considered the scale of the system.
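
As a minimal sketch of that last point, the scale of a horizontally scaled system can be computed as the percentage of primitive functions retained. The function names in the two lists are hypothetical, and counting every function equally is an assumption of this example.

```python
def functional_scale(full_functions, retained_functions):
    """Return the percentage of full-scale primitive functions retained."""
    full = set(full_functions)
    if not full:
        raise ValueError("the full-scale system must define at least one function")
    # Ignore any retained name that is not part of the full-scale system.
    retained = set(retained_functions) & full
    return 100.0 * len(retained) / len(full)

# Hypothetical primitive functions of a full-scale system.
full_system = ["read", "write", "sort", "merge",
               "report", "monitor", "audit", "recover"]
# Scaled system: "audit" and "recover" are eliminated across the system.
scaled_system = ["read", "write", "sort", "merge", "report", "monitor"]

print(functional_scale(full_system, scaled_system))  # 75.0
```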

In the opposite case, when the functional coupling between
levels is strong, scaling by eliminating bottom to top serial structures
would be the indicated method. This would be analogous to the
elimination of an entire subsystem and is represented by the outline A in
Figure 3. Since structured design carries a strong preference for
vertical coupling and requires the avoidance of complicated coupling
schemes, such as hybrid coupling, functional scaling appears to be a
practical method for top-down structured designs.

Figure 3

Consider the following examples of functional scaling:


a.  Eliminate  performance  monitors  throughout  the  system. 

b. Eliminate utilities which would provide the user with
transparency of data (format control, code translation, interfacing,
etc.).

c. Eliminate all non-standard OS requirements.

d. Eliminate non-critical ancillary functions.

e. Implement selected disjoint subsystems rather than the
integrated system.

4.  Security 

The term security can be defined as the extent to which
access to software or data by unauthorized persons can be
controlled. A user should be able to create and manipulate various types
of resources and delegate the access rights to a resource to other users.

A legitimate user of a resource is one who has either created it or
obtained permission to use it from another legitimate user. A user
should not be able to disrupt the processing of another user in any
unauthorized way, for example by causing him a denial of service.

The  degree  of  security  provided  for  software  and  data  is 
determined  by  the  scope  of  access  control  and  the  completeness  of  access 
audit.  Access  control  consists  of  those  attributes  of  software  that 
restrict access to and manipulation of programs and data. Access
auditing  is  the  procedure  whereby  an  historical  record  is  maintained  of 
both  successful  and  unsuccessful  attempts  to  access  restricted  data. 

Security  may  be  considered  a  valid  parameter  for  scaling  when 
the  scaled  system  will  be  developmental  in  nature  and  when  either 
adequate physical safeguards may be substituted for the full-scale
software security procedures or the data to be protected is simulated or
is non-sensitive public test data.

One approach to scaling security is to modify the file
protection procedures implemented to control access. (Methods for
identifying the legitimate user will be discussed later.) A system may
be considered scaled with respect to security if it encompasses a file
protection methodology less restrictive than the full-scale system. The
following list (by Randall Jensen in Software Engineering) summarizes six
levels of file protection starting with the least sophisticated:

a. No protection - file access and all operations are provided to
any user.

b. Total protection - no file sharing at all.

c. All or nothing - if access is granted, then all operations are
permitted.

d. Controlled sharing - a user is granted access rights which
are the minimum necessary to accomplish the specified task.

e. Specified access - access to each object is restricted and
access rights are owner-definable in several different contexts:

1) User-dependent - access rights are based on the
identity of the user requesting access.

2) Context-dependent - access is granted subject to the
environment (type of terminal, location, time of day, etc.).

3) Data-dependent - access to a record is controlled
depending on the contents of the record.

f. Post-access control - users may be granted access, subject
to the purpose for which it is to be used after access is accomplished.

Aside from the types of operating system-provided protections described
above, the "size" of the access specification may be scaled. One method
of specifying access is with the access matrix, the dimensions of which
(which may be altered) are determined in one direction by the number of
users, processes, or procedures which have access restrictions and in the
other direction by the number of objects for which access is restricted.
Changing either dimension necessarily scales the system with regard to
security.
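
A minimal sketch of an access matrix and of scaling one of its dimensions is given here. The subject and object names are hypothetical, and expressing the scaling as a simple ratio of dimensions is an assumption of this example rather than a definition taken from the report.

```python
def build_access_matrix(subjects, objects):
    """Return a subjects-by-objects matrix of (initially empty) access-right sets."""
    return {s: {o: set() for o in objects} for s in subjects}

def dimensions(matrix):
    """Return (number of subjects, number of objects) for an access matrix."""
    rows = len(matrix)
    cols = len(next(iter(matrix.values()))) if rows else 0
    return rows, cols

OBJECTS = ["file_a", "file_b", "file_c", "file_d", "file_e"]

# Hypothetical full-scale system: four restricted subjects, five objects.
full = build_access_matrix(["analyst", "operator", "batch_job", "auditor"], OBJECTS)
full["analyst"]["file_a"].update({"read", "write"})

# Scaled system: same objects, but only two subjects carry restrictions.
scaled = build_access_matrix(["analyst", "operator"], OBJECTS)

full_rows, full_cols = dimensions(full)
scaled_rows, scaled_cols = dimensions(scaled)
print(scaled_rows / full_rows)  # 0.5 (subject dimension halved)
print(scaled_cols / full_cols)  # 1.0 (object dimension unchanged)
```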

In  addition  to  the  file  protection  scheme  provided  by  all 
sophisticated  executives,  one  must  also  consider  the  classification  of 
the  data  and  the  clearances  needed  by  the  users.  Classified  information 
is commonly protected by a trusted subsystem which evaluates (beyond the
OS)  the  protection  afforded  and  access  granted  to  various  classes  of 
sensitive  data  and  programs.  Implementation  of  this  subsystem  provides 
several  new  methods  for  scaling. 

a. To access data, both the user and the terminal must have
access rights. To scale, we might grant all terminals access to
everything and provide only physical security for access to the
terminals. Alternatively, we could allow all users free rights to any
data.

b. The number of access types (by user, by classification
level, or by compartment) could be reduced.

c. Different data sets (further scaled by any method
described under data bases) could be provided for each access clearance.

d. The granularity of the data base could be modified for
each level of access. In other words, one user class might be granted
full access to all information while another might be restricted to
record level. An alternate method would be to restrict access only
beyond a certain file structure level across the entire class of users.

e. Codewords and special handling could be eliminated.

f. The audit trail that might be required of the full-scale
system could be ignored for the scaled implementation.

g. The authentication approach (to include passwords,
recording of access failures, log-on procedures, and terminal
authentication) could be simplified or eliminated.

h. Through the use of simulated data or limited transmission,
a full-scale requirement for encryption of data (by software) could be
suspended.

5.  Maintainability 

The  term  maintainability  can  be  defined  as  the  effort  required 
to  locate  and  fix  an  error  in  an  operational  program  and  is  a  technically 
valid  area  for  scaling,  since  the  implementation  of  maintainability 
incurs  increased  software  development  cost  and/or  time.  The  approaches 
which  might  be  employed  to  scale  this  parameter  are  closely  allied  to 
those  for  scaling  functionality  in  that  functional  requirements  of  the 
system  are  eliminated  or  simplified.  The  difference  is  that  the 
functions included to enhance the maintainability of a system would not
normally be the target of a design effort but rather tools to make
subsequent enhancements of the final product a routine process. With
this in mind, one could argue that since the scaled system is merely a
step toward the final product, modules whose purpose is to enhance
maintainability could be excluded from the scaled effort. Some examples
would be:

a.  Simplify  process-error  handling. 

b. Eliminate restart/recovery procedures.

c. Eliminate modules to reject and/or correct bad data.

d. Eliminate, reduce or modify fault location/trap software.

e. Exclude software to monitor system performance and gather
statistics.

f. Reduce back-up procedures to a minimum.

g. Include additional software diagnostic aids, program
tracers, and interactive debuggers. This is an unusual situation in that
additions made to the scaled system would reduce the development effort
by enhancing its effective implementation. The full-scale system would
probably contain built-in diagnostic aids as well, but to a lesser
degree.

Thus, the design of the scaled system may eliminate documentation,
recovery, and reconfiguration programs at the specific risk that the lack
of these elements may in fact prolong the project rather than enhance it.
This risk may, in some cases, be sufficient to preclude the scaling of
maintenance functions in the scaling process.

6.  Reliability 

The term reliability refers to the extent to which a program can be
expected to perform its intended function with consistency and required
precision. Reliability is the product of the error tolerance,
simplicity, accuracy and, of course, consistency of the software and data.

The scaling of reliability is a tricky business. While it is
certainly valid to state that the reliability standards in the scaled
system may be relaxed (and therefore scaled), some of the methods whereby
this could be achieved would be poor practice in any design effort,
scaled or not. Some of these would be inconsistency in calling sequence
and I/O conventions, non-standard data declaration and non-standard
design structure. More feasible alternatives would include:

a.  Reduction  of  precision. 

b. Elimination of error detection software geared to errors
which would occur infrequently in practice or not at all in the input to
the scaled system.

c.  Use  of  fast,  easy  (and  not  necessarily  accurate) 
approximation  functions  and  algorithms. 

d. Relaxation in enforcement of coding standards. This
approach might be considered in the case where the scaled design is a
skeleton of the final system and recoding would be necessary anyway.
When the scaled approach is to implement a complete subsystem, leaving
the door open for recoding the subsystem is not a good idea. Whether
this approach would indeed scale reliability is probably a function of
the quality of the programmers. Allowing each programmer to "do his own
thing" would speed up the development effort but might, if the
programmers were good, not significantly affect the reliability of the
result. Whether a case can be made for reduced reliability regardless of
programmer quality is a question for further research.
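
Alternatives a and c in the list above (reduced precision and fast approximation functions) can be illustrated with a small sketch. The choice of the sine function, the two-term series, and the two-digit rounding are all assumptions made for this example; a real scaled system would pick approximations suited to its own computations.

```python
import math

def sine_exact(x):
    """Full-scale sketch: library-accuracy sine."""
    return math.sin(x)

def sine_fast(x):
    """Scaled sketch: two-term series x - x^3/6, adequate only for small |x|."""
    return x - x ** 3 / 6.0

def reduce_precision(value, digits=2):
    """Scaled sketch: carry results to fewer decimal digits."""
    return round(value, digits)

# Compare the full-scale and scaled results for a few sample arguments.
for x in (0.1, 0.5, 1.0):
    exact = sine_exact(x)
    scaled = reduce_precision(sine_fast(x))
    print(f"x={x}: exact={exact:.6f} scaled={scaled:.2f} "
          f"error={abs(exact - scaled):.6f}")
```

The error grows as the argument moves away from zero, which is the kind of accuracy loss the scaled system would have to tolerate in exchange for simpler code.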

7.  Programming  Language 

Two aspects of programming language suitable for scaling are
language selection and implementation.

Each individual programming language has its unique strengths
and weaknesses. Implementation of a system in a scaled manner affords
the freedom to select an optimal language for the scaled system even
though that language may differ from the one chosen for the target
system. As an example, an assembly or machine-order language selected by
necessity for a real-time message handling system may be replaced by a
structured, higher-order language such as ALGOL or PL/I for the scaled
version of that system. Such a selection might be made based upon
considerations of top-down design, code readability, and modifiability,
thereby contributing to accelerated program development.

With  regard  to  programming  language  implementation,  the 
language  itself  might  be  scaled.  Consider  a  high-level,  user-oriented 
interactive  query  language  designed  to  implement  a  data  base  management 
system.  Scaling  might  be  accomplished  by  not  implementing  the  query 
language  at  all  in  the  scaled  system  (the  functions  would  be  provided  by 
an experienced programmer), by implementing a subset of the language, or
by  implementing  a  version  with  cruder  syntax  which  still  supports  the 
essential  query  requirements. 

In  the  case  of  a  compiled  language  implementation,  an 
adaptation  of  scaled  system  methodology  is  in  common  practice  today. 
When  a  compiler  is  developed,  a  compiler  supporting  a  subset  language  is 
generally  implemented  first.  Iterative  enhancements  of  the  baseline 
language  are  subsequently  achieved  through  the  use  of  the  compiler  itself 
to  generate  new  compiler  code.  In  this  way,  increased  productivity  is 
realized through the use of a higher-order language.

Another possible approach to language scaling would be in the
case where firmware (PROM - programmable read-only memory) is to be
employed in the final design. To speed the development effort, some
functions to be ultimately supported by firmware might be implemented by
software written in a high-level language.

Closely allied to functionality scaling would be the choice to
implement the support of a single compiler/language (COBOL, FORTRAN,
PL/I, etc.) for a system which must support general purpose computing or
to implement a single process-oriented language. An example of the
latter might be a case where the functions of message editor and text
editor would be supported by the system-standard editor for the scaled
system.

8.  Hardware  Configuration 

The  choice  of  individual  hardware  components  and  their 
configuration  is  an  important  aspect  of  the  scaled  systems  methodology. 
Significant savings in schedule, effort and cost may be achieved by
reconfiguring the target system's hardware or by selecting an alternate
operational environment for the scaled systems effort. A hardware
configuration is the arrangement and number of physical system components
that collectively comprise a system's "hardware", e.g., central
processing units (CPUs), core memory, peripheral memory, input-output
(I/O) devices, communications devices and the wired connections between
them.

A hardware configuration may be scaled at the most elementary
level by reducing either the number of component types or the total
number of components. Either approach reduces total system complexity.
Some devices, however, serve to reduce total system complexity by their
presence and do not lend themselves to elimination for purpose of
scaling. Examples would include intelligent terminals and peripheral
controllers or I/O processors.

Consider  the  following  list  of  feasible  hardware  modifications 
for  scaling: 

a. Reduce the number of CPUs. A system which is ultimately to
be multiprocessing could be scaled as a single processor.

b. The choice of CPU might, in fact, be different from that
used for the full-scale system. (Conceivably, CPU choice could be an
open question at the time the scaled system is developed.) The CPU used
for the scaled system might be one with which the design team is
familiar, one which is a substitute for a device under development, or one
which is "lesser" in terms of cost, capacity, speed, or word size.

c. Number and/or type of peripherals.

d. Front-end/back-end systems could be scaled by preliminary
work with only the front-end processor.

e. Increase the memory capacity of the scaled system. This
could speed up the development effort by eliminating or reducing page
faults and core swapping or by reducing the need for overlays.

f. Reduce the complexity of interrupt handling. For example,
use queued interrupts instead of prioritizing them. In the case of
real-time systems, eliminate real-time interrupts by eliminating the
input devices (simulate the data) or by considering them normal polled
input devices.


Hybrid  device  types,  complex  devices,  and  in-development  devices 
tend  to  increase  system  complexity  and  stretch  out  the  development 
schedule.  These  could  be  avoided  or  replaced  in  the  scaled  effort. 
Problems with such devices can be reduced or eliminated by substituting
existing, simpler, plug-compatible devices for them. Scaled effort,
schedule, and cost can be reduced by replacing complex devices with
simpler ones, in-development devices with existing ones, real-time
devices by software simulation, and interrupt devices with
processor-controlled ones. As noted, in some cases, it may be desirable
to  add  hardware  such  as  memory  to  reduce  the  degree  of  core  utilization 
required  or  monitoring  hardware  to  aid  system  evaluation,  validation  and 
verification. 

Noteworthy  is  the  fact  that  in  many  cases  components  do  not  have  to 
be  physically  removed  from  a  hardware  configuration  but  merely  logically 
disconnected  or  bypassed.  Finally,  considering  the  operational 
environment, it should be noted that multi-site and multi-national
development facilities could be scaled by physically limiting the
development to a single site, thus reducing complex communications
requirements altogether.

With regard to communications between the processor and peripherals,
the communications network itself may be scaled:

a.  Reduce  the  number  of  nodes  in  the  system. 

b. Scale satellite communications with hard-wired, local
systems.

c. Reduce the number of levels or complexity of a
communications network or hierarchy.

d. Provide an equal level of service to each node rather
than prioritized service.

e. Provide one-way rather than two-way message switching or
communications.

f. Reduce the complexity of the logical hardware paths
between the sender and receiver.

g. Provide a single communications path rather than include
backup (alternate) links.

In the case of relatively small, firmware-based embedded systems
such as on-board avionics systems, the facilities of a mainframe to
reduce the need for a high level of machine utilization, to emulate I/O,
and to support online, interactive program tracing and debugging would
serve as an aid to the development of software for such projects. Again,
in this case, "scaled" would not necessarily mean "smaller".

E.  CONCLUSION 

In  order  to  derive  a  scaled  system,  it  is  necessary  to  take  the 
functional specification of the full size system and apply some set of
scale factors. These factors must be applied toward the purpose of
simplifying the target system by a desired degree. The applications of
the scale factors should be in accordance with a programmed set of
objectives (not necessarily original design objectives) so that the
scaling results in a useful product. The application of the scale
factors to the functional specification should result in a scaled
functional specification which becomes the master document for the design
phase of the scaled system.

A summarized list of the possible scaling factors outlined in this
report follows:

1. Data base

a. Complexity and type of access method

b. Complexity of data structure

c. Size elements (number of files, length of files, etc.)

2. Performance

a. Productivity/Throughput (system capacity, job mix).

b. Responsiveness

c. Utilization

d. Operating System/Organization (processing mode, custom vs.
existing OS, interrupts).

3.  Functionality 

a. Vertical subsystem scaling (eliminate subsystem, utilities,
etc.)

b.  Horizontal  scaling  (e.g.,  performance  monitors). 

4.  Security 

a.  File  protection  method. 

b.  Dimensions  of  access  matrix. 

c.  Number  of  data  sets. 

d.  Classification  level  of  users  and/or  terminals. 

e. Granularity of data access control.

f.  Number  and  types  of  access  classifications. 

g.  Codewords. 

h. Audit trail.

i. Authentication.

j. Encryption.

5. Maintainability

a. Process-error handling.

b. Restart/recovery.

c.  Data  correction. 

d.  Fault  detection. 

e.  Monitors. 

f.  Backup. 

g.  Development  aids. 

h. Documentation.

6.  Reliability 

a.  Precision. 

b.  Data  error  detection. 

c.  Approximation  algorithms. 

d.  Coding  standard  enforcement. 

7. Programming Language

a. HOL (higher-order language) vs. assembly

b.  Language  subset 

c.  Single  vs.  multiple  languages 

d.  Replacement  of  firmware 

8.  Hardware  Configuration 

a. Number and complexity of hardware

b. Number of CPUs

c. Type of CPU

d. Memory capacity

e. Interrupts

f. Hardware monitors

g. Number of communications nodes

h. Complexity of communication network

i. Level of service to peripherals

A. ABSTRACT


Scale  factor  metrics  for  each  scale  parameter  are  discussed  and 
defined  as  an  extension  of  the  results  of  subtask  1.1  of  the  Scaled  Systems 
research  project.  The  research  performed  under  subtask  1.2  of  that  project 
is  summarized. 

Scale  factors  are  measures  of  the  degree  to  which  system  parameters 
are  scaled.  Metrics  are  the  unit  measures  chosen  to  express  these  system 
parameters. Using appropriate metrics, scale factors are defined in
objective, quantifiable terms and in such a manner as to be indicative of their
effects  on  cost,  schedule,  risk,  and  performance  of  scaled  vs.  full-scale 
systems.

B. OBJECTIVE


The objective of this research is to establish the basis through which
current software engineering principles may be applied to the measurement of
system attributes so that appropriate system scale factors may be
systematically determined.

Scale factors relate a scaled system to its corresponding full-scale
version based upon the criteria of cost, performance, and development schedule.
Because this relationship is critical to forecasting and planning, it is
important that it is based upon a sound analytical methodology and that it is
accurately expressed.

C. DISCUSSION


1.  Software  Parameters 

Through  the  research  of  software  scale  parameters,  various  aspects  of 
software  systems  were  identified  as  being  relevant  to  scaling.  These  aspects 
include data base, performance, functionality, security, maintainability,
reliability, programming language, and hardware configuration. Each aspect was
subsequently broken down into its component parts. The list of software aspects
and  their  component  parts  -  collectively  referred  to  as  "software  parameters"  - 
formed  the  basis  of  this  phase  of  the  research. 

2.  Metrics 

For each parameter, an attempt was made to identify a corresponding
metric suitable for calculating scale factors. Very few parameters, however,
could be expressed by existing metrics. The science of software metrics
is still an infant discipline and there exist only a few software metrics
generally accepted as such. These would include "Lines of Code" (LOC), "CPU
works per second" (from Capacity Management principles), and "Manmonth"
(or "manhour", "manday", "manweek", "manyear", or some equivalent).

3.  Direct  Metrics 

This  deficiency  of  currently  available  metrics,  however,  did  not  present 
a  major  obstacle  to  this  phase  of  the  research.  This  is  due,  in  part,  to  the 
fact  that  many  of  the  software  scale  parameters  themselves  imply  a  corresponding 
metric. 

Consider, as an example, the case of data base size. Components of data
base size include numbers of files, record types, and data field definitions,
and lengths of files, records, and fields. Each of these components describes its
own metric; the integral number of files is the metric for the "number of files"
component, etc. The applicable scale factor is merely the value for the scaled
system divided by the value for the full-scale system. This computation yields
a percentage ratio - just like a scale ratio - that is conceptually attractive
and easy to relate to and communicate.
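
A direct-metric scale factor of this kind might be computed as in the following sketch. The component names and the counts are hypothetical values chosen for illustration; only the rule "scaled value divided by full-scale value" comes from the text.

```python
def scale_factor(scaled_value, full_scale_value):
    """Return the scaled value as a fraction of the full-scale value."""
    if full_scale_value <= 0:
        raise ValueError("the full-scale value must be positive")
    return scaled_value / full_scale_value

# Hypothetical data base size components for the two systems.
full_scale = {"files": 40, "record types": 120, "fields per record": 25}
scaled = {"files": 10, "record types": 30, "fields per record": 20}

for component, full_value in full_scale.items():
    factor = scale_factor(scaled[component], full_value)
    print(f"{component}: {factor:.0%}")
# files: 25%
# record types: 25%
# fields per record: 80%
```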

4.  Metric  Indices 

For  software  parameters  that  do  not  have  a  corresponding  software  metric 
and  do  not  themselves  imply  the  metric  (e.g.  complexity  of  access  method,  file 
protection  method),  the  formulation  of  an  appropriate  scale  factor  is  not  as 
straightforward.  In  such  cases  a  choice  must  be  made  between  alternative  scale 
factor  formulation  methodologies. 

5.  Interrelated  Indices 

One  convenient  alternative  methodology  involves  the  assignment  of  discrete 
metric  values  to  each  member  in  a  group  of  related  software  attributes.  Such 
metric  values  (or  indices)  could  be  assigned  differently,  depending  on  what 
they  are  related  to.  One  possible  method  which  has  been  rather  extensively 
used  involves  interrelating  the  attributes  with  each  other  on  a  relative  scale. 
An application of this scheme could be the factoring of the degree of file
protection  under  the  software  aspect  of  security;  "no  file  protection"  would  be 
placed  at  one  end  of  the  scale  while  "total  file  protection"  would  be  placed  at 
the  other  end,  with  the  varying  degrees  of  file  protection  falling  in  between. 
"No  file  protection"  might  be  assigned  a  value  of  one  and  "total  file 
protection"  a  value  of  three;  thus,  if  no  file  protection  were  implemented  on  a 
scaled  system  emulating  a  full-scale  system  with  a  requirement  for  total  file 
protection,  the  component  scale  factor  would  be  computed  as  one  divided  by 
three,  or  33%. 
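
The index calculation just described can be sketched as follows. The values one and three are the ones given in the text; the intermediate entry "controlled sharing" with a value of two is a hypothetical placeholder for the "varying degrees" that fall in between.

```python
# Relative index for the degree of file protection. The end points (1 and 3)
# follow the text; the middle entry is an assumed illustration.
PROTECTION_INDEX = {
    "no file protection": 1,
    "controlled sharing": 2,
    "total file protection": 3,
}

def index_scale_factor(scaled_level, full_scale_level, index=PROTECTION_INDEX):
    """Return the ratio of the scaled system's index to the full-scale index."""
    return index[scaled_level] / index[full_scale_level]

factor = index_scale_factor("no file protection", "total file protection")
print(f"{factor:.0%}")  # 33%
```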

6. Global-Related Indices


A  variation  of  this  weighting  scheme  relates  the  component  parameters  to 
one  or  more  of  the  principal  global  software  system  aspects  -  cost,  schedule, 
risk,  and  performance.  Consider  these  relationships: 

Sequential  file  access  methods  would  scale  random  access  methods  by 
reducing  the  inherent  programming  and  data  structure  complexity  resulting  from 
the  use  of  record  dictionaries,  links,  and  pointers.  Substitution  for 
such  access  methods,  however,  would  also  scale  search  time  responsiveness  (an 
element  of  performance)  by  a  factor  determined  by  the  expected  number  of 
records  present.  These  interrelationships  will  be  studied  in  Task  2  of  this 
project. 
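
As a rough illustration of how the expected number of records could determine the responsiveness factor, consider the sketch below. It rests on assumptions not stated in the report: every record is equally likely to be requested, a sequential search therefore examines (n + 1) / 2 records on average, and a random access method reaches a record in a single probe.

```python
def expected_sequential_probes(record_count):
    """Average records examined in a successful sequential search,
    assuming every record is equally likely to be the target."""
    if record_count < 1:
        raise ValueError("the file must contain at least one record")
    return (record_count + 1) / 2

def responsiveness_scale(record_count, random_access_probes=1.0):
    """Ratio of random-access search effort to sequential search effort."""
    return random_access_probes / expected_sequential_probes(record_count)

# Hypothetical file sizes; the factor shrinks as the file grows.
for n in (9, 99, 999):
    print(n, responsiveness_scale(n))
# 9 0.2
# 99 0.02
# 999 0.002
```

Under these assumptions the responsiveness of the scaled system falls off roughly in proportion to file size, so the substitution is more attractive when the scaled data base is also small.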

Another  example  arises  in  the  enhancement  of  an  operating  system  to 
support  a  particular  application.  The  enhancements  primarily  provide  ancillary 
functions - a basic scaled capability can be implemented without them. To
obtain the source code to the operating system, become familiar with it, and
modify it is costly and time-consuming; to retain the vendor to perform the
modification is similarly expensive. The scale factor derived through the use of the
"off-the-shelf" operating system can therefore be computed based upon the cost
and schedule savings resulting from its use. Such a computation would probably
be easier to formulate, communicate, and understand than attempting to determine
a scale factor based upon the technical differences between the operating system
and its modified version.

A point to keep in mind here is the motivation for scaling systems:
achieving cost-effective system development with quality assurance. It is with
this perspective that global-related scale factors are constructed.

It must be noted that this report does not purport to provide an
authoritative definition of system metrics nor even scale factor metrics. Its intent,
rather, is to assemble a preliminary set of metrics to provide a common
discussion framework for scale factoring and a basis for subsequent research into the
measurement and development of scaled systems.

Continuing research of software metrics will be beneficial to the scaled
system project as well as the software engineering community through the ability
to better quantify software system attributes. In an expanding discipline,
definitions and emphasis tend to shift, contributing to the dynamic nature of
the terminology and technical base. Actual metrics and relationships may
therefore be re-sculptured as this project progresses toward the goal of
achieving an understandable and workable methodology for scaled systems
development.

D. TECHNICAL APPROACH

In  determining  scale  factors,  software  parameters  will  be  examined  in 
the  same  order  as  they  were  reported  in  Software  Scale  Parameters,  the 
report  delivered  under  subtask  1.1  of  this  research  effort  and  herein  referred 
to  as  "Report  1.1". 

For each aspect of software systems identified for scaling, a weight will
be assigned to each full-scale function within the range of that capability,
and it will be determined what part of each function is implemented by the
scaled system. The full-scale weights and the scaled values are each summed,
and the scaled total is divided by the full-scale total to obtain the scale
factor.
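
The weighting procedure just described can be sketched in a few lines. This is
only an illustration under stated assumptions: the function name
`scale_factor`, the example functions, and their weights and implemented
fractions are hypothetical and do not come from Report 1.1.

```python
def scale_factor(functions):
    """Return the scale factor for one software aspect.

    `functions` maps a function name to a (weight, fraction) pair, where
    `weight` is the full-scale weight and `fraction` (0.0 to 1.0) is the
    part of that function implemented by the scaled system.
    """
    full_total = sum(weight for weight, _ in functions.values())
    if full_total <= 0:
        raise ValueError("full-scale weights must sum to a positive value")
    scaled_total = sum(weight * fraction for weight, fraction in functions.values())
    return scaled_total / full_total


# Hypothetical weights and implemented fractions, for illustration only.
example = {
    "message receipt": (2.0, 0.5),    # half of the function is implemented
    "message storage": (1.0, 1.0),    # fully implemented
    "message retrieval": (1.0, 0.0),  # omitted from the scaled system
}
print(f"scale factor = {scale_factor(example):.0%}")
```

With these sample values the scaled total is 2.0 against a full-scale total of
4.0, so the sketch reports a scale factor of 50%.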


1.  Data  Base 

a.  Complexity  and  type  of  access  method 

There are three major data base access methods that are to be
considered for scaling purposes: sequential access, indexed sequential, and
direct access. Consider the average access times for a file of n records using
sequential and indexed sequential access:

$$ W_s = K_s' \cdot \frac{n}{2} = K_s n $$

$$ W_i = K_i \log n $$

where $K_s'$, $K_s$, $K_i$, and $K$ are constants, $W_s$ = average access time
for sequential access, and $W_i$ = average access time for indexed sequential
access (includes table search time).

The ratio of the two is

$$ R = \frac{W_s}{W_i} = \frac{K_s n}{K_i \log n} = K \frac{n}{\log n} $$

For some $n = n_0$, $R = 1$. That is, for a data base of $n_0$ records, the
sequential access and indexed sequential access methods yield the same search
times. The value $n_0$ could be determined by experimentation or experience.

For this $R = 1$,

$$ 1 = K \frac{n_0}{\log n_0}, \qquad K = \frac{\log n_0}{n_0} $$

So

$$ R = \frac{n \log n_0}{n_0 \log n} $$

Define R as the relative complexity.
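
As a hedged illustration of the relative complexity, the sketch below evaluates
R = (n log n0) / (n0 log n), the form obtained by fixing R = 1 at the
break-even file size n0. The function name and the break-even value of 16
records are assumptions chosen for the example; in practice n0 would come from
experimentation or experience. The base of the logarithm cancels in the ratio,
so natural logarithms are used.

```python
import math


def relative_complexity(n, n0):
    """R = (n * log n0) / (n0 * log n); requires n > 1 and n0 > 1."""
    if n <= 1 or n0 <= 1:
        raise ValueError("n and n0 must both exceed 1")
    return (n * math.log(n0)) / (n0 * math.log(n))


N0 = 16  # hypothetical break-even file size (records)
for n in (16, 256, 4096):
    # R = 1 at n = N0 and grows as the file gets larger
    print(n, round(relative_complexity(n, N0), 2))
```

Values of R above 1 indicate file sizes at which sequential access is slower
than indexed sequential access under this model.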

For random access,

$$ W_r = a h + (1 - a) K_r n $$

where $a$ is a function $a(n/s)$, $h$ = number of instructions in the hashing
algorithm, and $s$ = size of the hashing region. The term $a h$ is the hashing
time and $(1 - a) K_r n$ is the time to locate an empty space.

$$ \text{as } n/s \to 0,\ a \to 1 $$
$$ \text{as } n/s \to 1,\ a \to 0 $$

That is, as the region to which keys are hashed becomes denser, i.e.,
$n/s \to 1$, the random access method approaches the sequential method because
the search for an empty spot will approach a sequential search.
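
The limiting behaviour described above can be illustrated with a short sketch.
It takes the argument of a to be the fill ratio n/s of the hashing region. The
text does not give the form of the function a, so the linear choice
a = 1 - n/s, the function name, and the sample constants below are assumptions
made only so that the example runs.

```python
def random_access_time(n, s, h, k_r, a=lambda load: 1.0 - load):
    """W_r = a*h + (1 - a)*K_r*n, with a evaluated at the fill ratio n/s.

    n   : number of records stored
    s   : size of the hashing region (slots)
    h   : number of instructions in the hashing algorithm
    k_r : constant of the sequential-style search for an empty space
    a   : assumed weighting function; the default is a hypothetical linear form
    """
    if s <= 0 or not 0 <= n <= s:
        raise ValueError("need s > 0 and 0 <= n <= s")
    weight = a(n / s)
    return weight * h + (1.0 - weight) * k_r * n


# Hypothetical constants: 20 hashing instructions, K_r = 2, 100 slots.
for stored in (0, 50, 100):
    print(stored, random_access_time(stored, s=100, h=20, k_r=2.0))
```

With an empty region the cost is just the hashing time h; with a full region
the cost reduces to K_r n, the sequential-search term.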

The "cost" of a search can then be defined as

$$ C = C_i W_i + C_s W_s $$

where $C_i$ = cost of an instruction, $C_s$ = cost of storage, and $W_i$ and
$W_s$ are the number of instructions and the amount of storage for a given
method.

The "cost" calculations can be used as the scale metrics and can, for
a given operating system, use the pricing algorithm of that particular system.
The above formula is just one example of a costing algorithm. Another might be
$C = W_i \cdot W_s$.

The complexity of a given hierarchically structured data base will be
a function of the number of nodes, N, and the number of links between the
nodes, L.


For a hierarchical structure,

$$ N - 1 \le L \le 2N - 3 $$

For a network structure,

$$ N - 1 \le L \le \frac{N(N-1)}{2} $$

The complexity metric will be of the form:

Relative complexity C = L/N links per node (record)

In the simplest case, with the only links being between parents and
children,

$$ C = \frac{N-1}{N}, \qquad C \to 1 \text{ as } N \text{ increases} $$

In the most complex,

$$ C = \frac{2N-3}{N} = 2 - \frac{3}{N}, \qquad C \to 2 \text{ as } N \text{ increases} $$

In a network structure, the most complex case will be:

$$ C = \frac{N(N-1)/2}{N} = \frac{N-1}{2} \approx \frac{N}{2} $$

L/N can be viewed as an average number of links per node, where the
more links a given node can have, the more complex is the implied structure.

An absolute complexity might be defined as D, the depth (the number of
levels of the tree), or N, the number of nodes.
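
A minimal sketch of the links-per-node metric follows. It uses the bounds
stated above (N-1 to 2N-3 links for a hierarchy, N-1 to N(N-1)/2 for a
network); the function names and the sample node counts are hypothetical.

```python
def link_bounds(n_nodes, structure):
    """Return (min_links, max_links) for the bounds given in the text."""
    if n_nodes < 2:
        raise ValueError("need at least two nodes")
    if structure == "hierarchical":
        return n_nodes - 1, 2 * n_nodes - 3
    if structure == "network":
        return n_nodes - 1, n_nodes * (n_nodes - 1) // 2
    raise ValueError(f"unknown structure: {structure!r}")


def links_per_node(n_nodes, n_links, structure):
    """Relative complexity L / N, after checking L against the bounds."""
    low, high = link_bounds(n_nodes, structure)
    if not low <= n_links <= high:
        raise ValueError(f"L must lie in [{low}, {high}] for N = {n_nodes}")
    return n_links / n_nodes


for n in (10, 100, 1000):
    simplest = links_per_node(n, n - 1, "hierarchical")
    most_complex = links_per_node(n, 2 * n - 3, "hierarchical")
    network_max = links_per_node(n, n * (n - 1) // 2, "network")
    # simplest tends to 1, most_complex tends to 2, network_max is (N-1)/2
    print(n, simplest, most_complex, network_max)
```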

How  a  hierarchical  data  base  is  stored  will  be  related  to  complexity, 
as  well.  Sequential  listing  of  a  tree  structure  is  slow  compared  with  linked 
list  storage  but  the  lists  require  extra  storage  and  more  complex  programming. 

The  choice  of  storage  organization  for  a  network  structure  will  result 
in  the  same  variance.  These  interrelationships  will  be  studied  in  a  later  phase 
of  the  project. 

The  relational  representation  of  a  data  base  is  so  simple  that  the 
measure  of  the  complexity  of  any  given  relationally  structured  file  would  be  a
linear  function  of  the  number  of  tables  and  rows.  It  can  be  thought  of  as  a
tree  with  each  table  representing  a  parent  node  at  level  1  and  the  number  of 
links  equal  to  the  total  number  of  rows,  the  rows  being  on  level  2  of  the  tree. 

b.  Complexity  of  data  structure 

(1)  Hierarchical 

The parameters that form a basis for scaling are the number
of levels of the tree, the degree (number of successors to a given node),
and the total number of nodes (records).

(2)  Network 

In  the  network  model,  it  is  also  necessary  to  consider  the 
linkage  factor,  where  scaling  would  involve  limiting  the  number  of  logical 
links  between  the  nodes. 

c.  Size  Elements 

The  elements  of  data  base  size  lend  themselves  well  to  scale  factor¬ 
ing.  Because  each  is  a  quantum  entity,  the  resultant  scale  factor  may  be 
computed  as  a  fraction  in  which  the  value  for  the  scaled  system  is  represented 
in  the  numerator  and  the  value  for  the  full-scale  system  is  found  in  the 
denominator.  For  example,  the  scale  factor  for  "number  of  files"  is  found  by 
dividing  the  number  of  files  for  the  scaled  system  by  the  number  of  files  for 
the  full-scale  version.  Scale  factors  for  the  size  parameters  listed  in  Report 
1.1  are  as  follows: 

Parameter                    Scale Factor

Number of Files              Number of Files (Scaled System) /
                             Number of Files (Full-Scale System)

Length of Files (bytes)      Length of File (File - Scaled System) /
                             Length of File (File - Full-Scale System)

Length of Records (bytes)    Length of Records (File - Scaled System) /
                             Length of Records (File - Full-Scale System)

Number of Data Fields        Number of Fields (Records - Scaled System) /
                             Number of Fields (Records - Full-Scale System)

Length of Data Fields        Length of Field (Record - Scaled System) /
(bytes)                      Length of Field (Record - Full-Scale System)

2.  Performance 

The  elements  of  performance  suitable  for  scaling  were  identified  in 
Report  1.1  as  productivity,  interactive  responsiveness,  utilization,  and 
operating  system  organization. 

a.  Productivity 

Productivity is a common measure of system performance. It is
composed of two elements: the amount of work that can be physically
accommodated and the rate at which it is ultimately accomplished. The
first element is described by the system's capacity - the principal factor
limiting workload. The second element is described by the system's throughput.
Relating throughput to capacity yields an efficiency or performance index.
Increasing system capacity implies acquiring additional hardware; increasing
throughput, on the other hand, entails obtaining a corresponding increase in the
operating system's efficiency, although this can also be accomplished through
new hardware.

(1) Capacity

Borrowing from Capacity Management technology, a system's capacity
may be defined as the amount of information it can contain at any certain
period of time. The metric used to measure information is the byte (eight
Boolean bits or the equivalent of one alphanumeric character), and these are
aggregated for each external device type and for the total internal memory to
arrive at the system's "capacity", the total number of bytes in the system.

(2)  System  Power 

System  power  can  be  derived  if  the  rate  at  which  it  can 
manipulate  information  (bits  or  bytes)  between  the  various  capacity  components 
can  be  determined.  The  number  of  bytes,  or  amount  of  information,  systems  can 
manipulate  internally  and  between  peripherals  in  a  given  amount  of  time  is 
generally  a  known  quantity,  and  thus  can  be  used  as  the  metric.  By  collecting 
and  correlating  this  type  of  information,  one  can  begin  to  determine  the 
relative  power  of  different  systems  by  comparing  their  capacity  and  ability  to 
handle  data.  Cost  is  directly  correlated  with  power.  Scaling  systems  can  thus 
be  achieved  through  scaling  their  power  at  the  cost  of  not  being  able  to  store 
and  process  as  much  data  at  a  given  time  or  at  as  fast  a  rate. 

(3)  Hardware  Capacity 

Hardware  components  provide  metrics  by  which  they  may  be  measured 
and  compared.  Memory  size  is  a  good  example  as  is  the  speed  of  a  communications 
line.  A  one  megabyte  memory  module  scales  four  megabytes  by  75%  (the  resultant 
scale  factor  is  25%);  a  300  baud  modem  scales  a  3600  baud  one  by  92%  with  a 
resulting  scale  factor  of  8%. 
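
The two hardware examples above follow the same scaled-over-full-scale ratio
used for the data base size elements. A minimal sketch, with a hypothetical
function name, reproduces the arithmetic:

```python
def size_scale_factor(scaled_value, full_scale_value):
    """Scale factor = scaled-system quantity / full-scale quantity."""
    if full_scale_value <= 0:
        raise ValueError("full-scale value must be positive")
    if not 0 <= scaled_value <= full_scale_value:
        raise ValueError("scaled value must lie between 0 and the full-scale value")
    return scaled_value / full_scale_value


memory = size_scale_factor(1, 4)        # megabytes
modem = size_scale_factor(300, 3600)    # baud
print(f"memory: scale factor {memory:.0%}, scaled by {1 - memory:.0%}")
print(f"modem:  scale factor {modem:.0%}, scaled by {1 - modem:.0%}")
```

The modem ratio is 300/3600, about 8.3%, which the text rounds to a scale
factor of 8% and a reduction of 92%.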

(4)  Software  Capacity 

Each element of the software can, again, provide its own metric;
e.g., table size can be scaled by reducing the number of bytes. The
number-of-bytes metric would also apply to input and output field sizes.

Robustness of a system, the ability to handle a broad spectrum of data volumes
in excess of that originally anticipated, can be scaled by implementing a
minimum of error checking. The metric would be the number of error conditions
to be checked in the system.

Throughput is a measure of the system's efficiency in using
resources. Throughput is usually a function of the operating system, but is not
restricted to such, and it is generally expressed as the amount of work
processed in a certain time frame. This can be best visualized by considering a
batch-type environment; the metric would be defined as the number of user jobs
completed per unit time: the more user jobs completed in a given amount of
time, the greater the throughput. Similarly, the more job-steps completed in a
given amount of time, the greater the throughput. In addition, input data rates
are a measure of throughput. When throughput is related to capacity, a
performance/efficiency index is obtained. In general, throughput is a function
of a large number of factors, including the percentage of time that a system is
operable. They are all intimately related to productivity and, when combined,
yield a measure of performance efficiency. Scaling performance has considerable
potential for cost savings because realizing high efficiency in EDP systems
tends to drive costs exponentially and schedules proportionally higher.

b.  Interactive  Responsiveness 

Interactive responsiveness was defined in the proposal as the inverse
of response time, that is, the number of responses per unit time. This
definition maintains consistency in defining parameters so that their "down"
direction implies scaling and their "up" direction implies unscaling.
Responsiveness is dependent on many factors. In a message switching system,
responsiveness is dependent on such parameters as message control efficiency,
communication line speeds, and the number of message transceivers present; in
an information based system, it is dependent upon keyed-search efficiency,
storage unit access times, retrieval speeds, frequency of queries, etc.

It should be mentioned that responsiveness is very difficult to predict
at the front end of the implementation phase. This parameter is usually
quantified through observation. In the past, response times were typically
established as a system requirement. If the resulting system did not meet the
target, much work was expended to modify the system and bring the response time
within specifications. Research has shown that this is consistently the most
costly way to effect what essentially are design changes: at the tail end of
the development cycle. Scaled system development, on the other hand, provides a
scaled model of the ultimate system which would conspicuously reveal such
design deficiencies, so that the full-scale system design specification can be
cost-effectively adjusted at the front end of the design cycle, where economic
leverage is the greatest. Suppose it is anticipated that a system's
responsiveness will degrade in direct proportion to the number of terminals
connected to it. If a scaled system with one-tenth as many terminals does not
respond in less than the targeted response time, it should be clear that there
exists a deficiency in the design specification. Although simplified, this is
probably a typical analysis example for responsiveness.
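
The terminal example can be sketched as follows. The first function applies the
check stated in the text; the second goes one step further and projects a
full-scale response time, which assumes that response time grows linearly with
the number of terminals. That linear model, the function names, and the sample
measurements are assumptions for illustration only.

```python
def design_is_deficient(measured_seconds, target_seconds):
    """Check from the text: the scaled system already misses the target."""
    return measured_seconds >= target_seconds


def projected_response_time(measured_seconds, scaled_terminals, full_terminals):
    """Project the full-scale response time, assuming linear degradation."""
    if scaled_terminals <= 0 or full_terminals < scaled_terminals:
        raise ValueError("need 0 < scaled_terminals <= full_terminals")
    return measured_seconds * full_terminals / scaled_terminals


# Hypothetical measurement: 0.5 s with 10 terminals; full scale has 100.
measured, target = 0.5, 3.0
print("deficient at scaled size:", design_is_deficient(measured, target))
print("projected full-scale time:", projected_response_time(measured, 10, 100))
```

In this made-up case the scaled system meets the target, but the linear
projection exceeds it, which would also argue for revisiting the design
specification early.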

c.  Utilization 

The  effects  of  high  processor  utilization  on  costs  and  schedule  are 
fairly  well  documented;  above  approximately  50%  utilization,  costs  begin  to 
rise  exponentially  and  schedules  grow  proportionally  -  90%  utilization  will 
triple  the  costs  of  50%  utilization.  Such  figures  have  generally  been  derived 
from  studies  concerned  with  core  memory  and  processor  time  utilization. 


The computation of utilization is fairly straightforward, as the
differences can be attributed to the entity under scrutiny. Utilization can be
measured in terms of the percent capacity used; alternatively, it can be
measured according to the time used as compared to the time available. If
necessary, the utilization of each component of the system can be measured. For
example, CPU utilization would be measured in instructions per unit time,
memory utilization as the proportion of total memory available, etc. As an
example of the calculation, an eighteen megabyte disk drive containing nine
megabytes of information is described as being 9/18, or 50% utilized.
Similarly, if a Management Information System (MIS) package is on-line for a
total of six hours during an eight hour workday due to user demand, its
utilization may be computed as 6/8, or 75%.

d.  Operating  System/System  Organization 

Report  1.1  identified  subelements  of  Operating  System/System 
Organization  as  processing  mode,  operating  system,  and  interrupt  processing.  In 
that  report,  the  difficulty  in  quantifying  these  aspects  was  addressed.  Scaling 
system  aspects  applicable  under  this  category  would  undoubtedly  be  highly 
case-dependent  and  quantifying  the  factors  largely  subjective. 

(1)  Processing  Mode 

In  a  system  where  batch,  interactive,  and  real-time  processing 
modes  are  supported,  a  scaled  system  could  consider  only  a  single  mode  of 
operation.  Also,  a  real-time  system  could  be  scaled  with  a  batch  system  and 
simulated  input  data.  Measuring  the  resulting  decrease  in  complexity  (if 
measuring  could  be  done  at  all!)  would  appear  to  be  not  as  valid  as 
investigating  resultant  changes  in  other,  quantifiable  system  aspects  which 
interrelate with the processing mode.

(2) Operating System

As noted in Report 1.1, if the full-scale system requires a
custom operating system, the scaled system could use an off-the-shelf system
or modify an existing one. The resultant cost and schedule changes can be
used as the metric.

3.  Functionality 

H.D. Mills states that the basic functions (60-80% of the processing)
of a unit of software are usually a small fraction (20-40%) of the total
software finally built. This assertion is generally accepted and holds deep
implications for Scaled Systems Technology. Since effort is strongly correlated
with produced code, an initial operating capability of a full-scale system
(barring ancillary functions, documentation, installation, maintenance, and
user support) could be achieved with only 20-40% of the total projected effort.
This attests to the viability of the scaled systems approach and identifies
functionality as a principal system aspect suitable for scaling. Functionality
can be scaled by reducing the variety of functions supported (eliminating
ancillary or additional support functions), or by reducing functional
complexity. The first method entails vertical functional scaling (eliminating
sub-systems); the second, horizontal functional scaling.

a.  Modularity 

When speaking in terms of functionality, modularity is probably the
system parameter that is being most directly dealt with. Modularity describes
the number and composition of the various program modules comprising a system.
The complexity metric of a system can be defined in a manner analogous to that
used for defining the complexity of a hierarchical structure.

Absolute complexity = number of modules

Relative complexity = (number of module linkages) / (number of modules)

b.  Factoring  Vertical  Functional  Scaling 

In the case of vertical scaling, scale factors could be computed based
solely upon the number of functions eliminated as compared to the total number
of functions called for in the requirements or design specification. For
example, eliminating 3 of 12 functions reduces functionality by 3/12, or 25%;
the scale factor would then be computed as (12-3)/12, or 75%.

Preferably, the amount of code necessary to support each function would
be a known quantity. Thus, if the three functions discussed in the previous
example required 40,000 lines of code (LOC) from a total system size of 100,000
LOC, the resultant scale factor would be 60% ((100,000 - 40,000)/100,000), as
opposed to 75%.

The  absolute  complexity  could  be  scaled  by  reducing  the  number  of 
modules.  A  more  accurate  metric  for  measuring  the  scale  factor  might  be 
number  of  lines  of  code. 
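
The difference between counting functions and counting lines of code can be
shown with the figures from the example above. The function names are
hypothetical; the numbers are those given in the text.

```python
def scale_factor_by_count(total_functions, eliminated_functions):
    """Scale factor based on the number of functions retained."""
    if total_functions <= 0 or not 0 <= eliminated_functions <= total_functions:
        raise ValueError("invalid function counts")
    return (total_functions - eliminated_functions) / total_functions


def scale_factor_by_loc(total_loc, eliminated_loc):
    """Scale factor based on the lines of code retained."""
    if total_loc <= 0 or not 0 <= eliminated_loc <= total_loc:
        raise ValueError("invalid LOC counts")
    return (total_loc - eliminated_loc) / total_loc


by_count = scale_factor_by_count(12, 3)           # 3 of 12 functions removed
by_loc = scale_factor_by_loc(100_000, 40_000)     # those 3 account for 40,000 LOC
print(f"by function count: {by_count:.0%}")
print(f"by lines of code:  {by_loc:.0%}")
```

The two measures diverge (75% against 60%) because the eliminated functions
account for a disproportionate share of the code.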

c.  Factoring  Horizontal  Functional  Scaling 

Deriving a scale factor for horizontal scaling may be more difficult.
In the case of eliminating a common shared functional module, such as a monitor
or security subsystem, the analysis could be analogous to that of vertical
scaling. If, however, horizontal scaling is achieved by reducing module sizes
due to decreased complexity, the analysis may have to be more subjective. As in
the case of operating system/system organization, computation of horizontal
scale factoring will be reserved for a case-by-case analysis and future
research.

For a message handling system, the receipt/transmission can be scaled
by omitting some of the functions. For example, a full-scale system might have
message receipt, transmission, dissemination, storage, and retrieval
capabilities while the scaled system might receive messages from only one input
source, not transmit messages, etc. The scale factors for these functions
would be defined as the percentage of full-scale functionality implemented
by the scaled system.

For message-receiving systems, the number of networks with which the
system interfaces could determine the weight factor in the metric:

$$ \text{Message receipt scale factor} = \frac{\text{number of network interfaces in scaled system}}{\text{number of network interfaces in full-scale system}} $$

The functions of message transmission, e.g. handling new messages and
retransmission, imply a weight assignment in this case of two to the
full-scale system.

$$ \text{\{Transmission, dissemination\} scale factor} = \frac{\text{number of \{transmission, dissemination\} functions in scaled system}}{\text{number of \{transmission, dissemination\} functions in full-scale system}} $$

The weights of the full-scale system and those of the scaled system
factors can be summed to produce a total functionality scale factor:

$$ \text{Functionality scale factor} = \frac{\sum \text{weights in scaled system}}{\sum \text{weights in full-scale system}} $$

As more detail about the full-scale system design is acquired, it
will be possible to further refine these metrics by assigning weights to
the proposed subfunctions that reflect the complexity or resource requirements
for implementation. At this level of detail, typical metrics used previously
include source lines of code estimates, staff estimates, and number of pages
devoted to each function within the functional description.

4.  Security 

Currently, system security is a topic of prime consideration and vigorous
research. As more information is entrusted to computer systems, concern for
security necessarily increases. This concern has been manifested in such areas
as automatic encryption technology and operating systems through the Kernelized
Secure Operating System. To date, many security strategies have been
implemented, including the use of multiple processor networks to distribute
varying levels of classified material and to ensure that the determinacy of
access privileges can be maximized.

Barring automatic cyphering/decyphering hardware for communications, much
of computer security is achieved through overhead software. This may be
accomplished at any one of the many system levels: kernel, executive,
(operating) system, sub-system, and application. Since few operating systems
are built with the goal of providing information processing with multiple
security levels, most security schemes are implemented at the sub-system level
or below. Regardless of system level, security processing primarily involves
the validation of resource requests against tables cross-referencing valid
requestors (users and programs) and resources. Such tables require maintenance
modules to keep them up to date as well as access and search modules (which
must also be secure!). Depending on the degree of security provided, these
tables grow increasingly complex and their associated processing overhead
grows; thus, comparing the magnitudes of such tables provides an adequate means
of quantifying security metrics in addition to the obvious metrics of required
design and coding effort. Additionally, many systems identify the need for
access audit modules which track details concerning requests for resources and
the security modules' resulting actions.

The security probability (Gilb) is defined as

P(a) = probability of successful attack rejection

This probability will vary depending on the level of protection in effect.

a.  File  protection 


Six levels of file protection have been defined by Randall Jensen in
Software Engineering, as follows, with a scale value attached to each:

Scale Value    Levels of File Protection

1              no protection - File access and all operations available
               to any user

2              all or nothing - If access granted, then all operations
               permitted

3              controlled sharing - User is granted access rights which
               are the minimum necessary to accomplish the specified task

4              specified access - Access to each object is restricted and
               access rights are owner-definable; the rights could be
               based on the user's identity, the environment (type of
               terminal, time of day, etc.), or the contents of a record

5              total protection - No file sharing at all

6              post-access control - Users granted access subject to the
               purpose for which it is to be used after access accomplished


b.  Dimensions  of  access  matrix 


The dimensions of the matrix are determined by the number of users,
processes, or procedures which have access restrictions and in the other
direction by the number of objects for which access is restricted. These two
factors to be scaled would have metrics as follows:

$$ \frac{\text{number of users with access restrictions}}{\text{total number of users}} $$

$$ \frac{\text{number of procedures with access restrictions}}{\text{total number of procedures}} $$


c.  Number  of  data  sets 

Different  data  sets  could  be  provided  for  each  access  clearance. 

d.  Classification  Level  of  users  and/or  terminals 

To scale classification level, all terminals/users could be granted
access to everything, with only physical security provided for access to the
terminals. The estimated amount of effort (LOC) necessary to implement a
subsystem which would allow other than open access would be used as the metric.

e.  Granularity  of  data  access  control 

Access  to  different  user  classes  could  be  modified. 

f.  Codewords

Codewords  could  be  eliminated,  with  the  metrics  as  follows: 

0  =  No  Codewords 
1  =  Codewords 

g.  Audit  Trail 

A  similar  metric  could  be  defined  as  follows: 

0  =  No  audit  trail 
1  =  Audit  trail 

h.  Authentication  approach 

Each function, including the use of passwords, the recording of access
failures, log-on procedures, and terminal authentication, would be assigned a
weight determined by the amount of software (lines of code) needed to implement
it. The scaling resulting from the simplification or elimination of these
capabilities could be measured by the percentage of lines of code eliminated.

i.  Encryption 

Through  the  use  of  simulated  data  or  limited  transmission,  a 
full-scale  requirement  for  software  encryption  of  data  could  be  suspended. 

The metric would be:

0 = No encryption
1 = Encryption

As more detail about the proposed system is acquired, the weights of {0,1}
could be adjusted to indicate in some way their complexity or resource
requirements for implementation. However, for simplicity, the present metrics
will only indicate whether or not the function is implemented.

5.  Maintainability 

Building-in maintainability is generally done to minimize the time required
to locate and fix a bug in the software during the test, integration, and
maintenance phases of its life-cycle. This time will inevitably depend
directly on the amount and quality of supporting technical documentation
available. Of course, complementary aspects of maintainability are the
auto-correcting, recovery, or diagnostic facilities supplied with the final
software product. Scaling such aspects of maintainability as amount of
documentation and maintenance aids supplied may seem contrary to sound
developmental practices, but it must be remembered that the anticipated
life-cycle of a scaled system is short (just long enough to get an effective
"handle" on the design and functionality of the full-scale system); the scaling
of these aspects can therefore be justified.

The appropriate metrics suitable for factoring maintainability are the amount
of documentation produced and the functionality of the ancillary maintenance
modules (the auto-correcting, recovery, and diagnostic software - see
"Functionality"). In addition, the effort required for configuration management
may be scaled, in the respect that the configuration management necessary for
the scaled effort need not be as extensive as that of a full-scale system.
Again, consideration of the expected life-cycle duration is paramount.

Maintainability is defined as the probability that, when maintenance
action is initiated under stated conditions, a failed system will be restored
to operable condition within a specified time t.

Maintainability is a function of the capabilities included in the
system, the skill level of the personnel, and the support facilities
(locally available tools and diagnostic test equipment or aids, spare parts/
alternative program versions/back-up files). It is a measure of the cost
and time required to fix software errors in an operational system. Among
the maintenance modules which could be scaled are:

a.  Process  error  handling 

The  scaling  involved  in  minimizing  the  number  of  conditions  to  be 
checked  can  be  measured  by  the  number  of  lines  of  code  needed  to  implement  the 
error  checking. 

b. Restart/recovery procedures

The  restart  procedures  can  be  eliminated  as  much  as  possible  and 
a  set  of  values  assigned  as  follows: 

0  =  no  restart  procedures 

1  =  minimal  restart  procedures 

2  =  complete  restart  procedures 

c.  Data  correction 

Modules  to  correct  and/or  reject  bad  data  can  be  eliminated  with 
the  scaling  being  measured  by  the  reduction  in  the  number  of  lines  of  code. 

d.  Fault  detection 

Fault  location/trap  software  can  be  eliminated,  reduced,  or 
modified,  again  using  the  number  of  lines  of  code  as  the  metric. 

e.  Monitors 

Software to monitor system performance and gather statistics can
be eliminated and the (estimated) number of lines of code needed to implement
these functions can be used as the metric.

f. Backup

Backup  procedures  can  be  minimized. 

g.  Development  aids 

Development  aids  such  as  program  tracers  and  interactive 
debuggers  would  actually  be  added,  to  reduce  the  development  effort  of  the  full 
scale  system. 


h.  Documentation 

Documentation  objectives  are  concerned  with  the  quality  and 
quantity  of  user  publications.  Scaling  the  amount  of  documentation  may  be 
risky  as  pointed  out  in  1.1,  because  the  lack  of  documentation,  recovery,  and 
reconfiguration  programs  may  actually  hamper  rather  than  enhance  the  program. 

6.  Reliability 

Reliable software is software that does not fail. The metric commonly
used for reliability is the frequency of failures occurring over a specific
period of time. Obviously, "building-in" high reliability is costly and is only
justified in applications demanding infallible software, such as man-rated
applications (i.e., applications where lives are at stake). One software aspect
reflecting upon reliability is robustness. Robustness describes the software's
ability to adequately accommodate erroneous input values: values which, if
undetected, could cause the software to produce inappropriate output, fail, or
"crash". The pitfall of error detection is the difficulty of anticipating all input
combinations which could cause the software to fail. This requires rigorous
requirements formulation and design; reliability cannot be "tested" into
software. As such, reliability is a worthy area for scaling, and again the scaled
system approach presents the opportunity to validate and refine the typically
heuristic systems formulated for error detection and correction.


The concept of reliability is contrasted to that of maintainability in
that maintainability is concerned with readily fixing or enhancing the software,
whereas the thrust of reliability is to prevent failures from occurring in the
first place. Accordingly, a certain amount of redundancy is built into systems
such that automatic diagnosis and recovery can be accomplished by the software
itself without operator attention or intervention. Such methodologies are a
principal component of data base management systems, where the ability to
detect a degraded data structure and rebuild it is crucial to their reliable
operation. This redundancy requires additional design, system storage,
programming, and effort; as such, reliability may be scaled with respect to
these aspects.

Reliability  can  be  defined  as  probability  of  satisfactory  performance  for  a 
given  time  when  used  under  stated  conditions,  the  metric  being  defined  as  the 
number  of  failures/time.  A  software  error  is  present  when  the  software  does  not 
do  what  the  user  reasonably  expects  it  to  do.  A  software  failure  is  an 
occurrence  of  a  software  error. 

a.  Precision 

The precision metric is defined as the number of decimal places or
bits, whichever is the most convenient unit to use for the particular
application.

b. Data error detection

Software geared to errors which would appear infrequently in practice
or not at all in the input to the scaled system can be eliminated.

The metric will be defined as the number of lines of code for error
detection.

c.  Approximation  Algorithms 


Scaling  can  be  accomplished  through  the  use  of  fast,  easy  (not  as 
accurate  as  possible)  approximation  functions  and  algorithms. 

The motivation for scaling approximation algorithms is to minimize
lines of code or complex operations which are prone to error.

The lines of code required to implement algorithms could be the
metric. The logical complexity of a program, a measure of the degree of
decision-making within a system, could also be used. The absolute logical
complexity measure is defined as the number of non-normal exits from a decision
statement (IF, ON, AT END, etc.). The relative logical complexity is defined as:

\[ \text{Relative Logical Complexity} = \frac{\text{Absolute Logical Complexity}}{\text{Total number of Instructions}} \]

To  minimize  complexity,  maximize  the  independence  of  each 
component  of  a  system. 
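
The relative logical complexity ratio defined above can be sketched in a few lines of Python. This is only an illustration: the function name and the module sizes are hypothetical, and counting the non-normal exits themselves is assumed to be done elsewhere.

```python
def relative_logical_complexity(absolute_complexity: int, total_instructions: int) -> float:
    """Relative logical complexity = absolute logical complexity / total instructions.

    absolute_complexity is the count of non-normal exits from decision
    statements; how that count is obtained is outside this sketch.
    """
    if total_instructions <= 0:
        raise ValueError("total_instructions must be positive")
    return absolute_complexity / total_instructions


# Hypothetical figures for a full-scale module and its scaled counterpart.
full_scale = relative_logical_complexity(absolute_complexity=120, total_instructions=2400)
scaled = relative_logical_complexity(absolute_complexity=30, total_instructions=1200)
```

With these made-up numbers the scaled module has half the relative complexity of the full-scale one, even though it has only a quarter of the decision exits, because its instruction count was halved as well.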

d.  Coding  Standards 

Relaxation  in  enforcement  of  coding  standards  would  only  be  done  in 
cases  where  recoding  would  be  necessary  to  implement  the  full  system.  If, 
however,  the  scaled  system  will  form  the  basic  structure  for  the  full  system, 
then  strict  coding  standards  should  be  maintained. 

7.  Programming  Language 

The important principles in language syntax and semantics are uniformity
(i.e., a language construct that appears in several contexts should have the same
syntax and semantics) and simplicity, which implies clarity and integrity of
language concepts.

More often than not, the choice of a programming language is set or, at
best, limited at any one development installation. Selection of an alternate
language can be prompted by a number of reasons. These include non-existence
or limited support of the target machine or language and development complexity
of the target language.

Consider the case of "ADA", the proposed DoD standard programming
language which, at the time of this writing, has been specified but not yet fully
implemented. A software development installation could still begin work on a
project targeted for ADA-language implementation through the use of an existing
high-level language, designing and coding it with the anticipation and intention
of future conversion to ADA. While it may be difficult to visualize the
"scaling" in this example, it represents the use of software other than that targeted
so that a preliminary product can be readily assembled and evaluated for any
design or operational deficiencies with the intent of minimizing the overall
development schedule, risk, and cost.

In contrast to the case of no language, there may be no machine available for the
development of a software application. This case is not infrequent, as software
projects are often started in anticipation of the delivery of hardware (which is
invariably delivered late), or the production of hardware which is not yet
marketed but whose characteristics have been fully specified. In these cases, the
software project need not be delayed, as tools such as cross-assemblers,
compilers, hardware simulators, and emulators can be utilized so that the scaled
production, evaluation, and design iteration can get underway.

The writing of software, much like any other creative process, is largely
an iterative process involving the refinement of working "drafts" toward the
goal of a product in final form. Some languages facilitate this type of process
more readily than others even though they may not be the best choice for the
ultimate implementation. A perfect example of this is that of the language
interpreter. Typically (and necessarily) slow and inefficient in terms of execution
speed and run-time hardware requirements, language interpreters are interactive
and promote and facilitate quick implementation of anything from an application
program to an operating system through built-in type checking, syntax
analyzers, statement editors, and breakpointing.

"Structured"  languages  are  also  well  suited  to  subsequent  modification  and 
semantical  re-working  while  languages  resembling  assembler  dialects  are  more 
difficult  to  read  and  comprehend  and  thus  harder  to  use  in  iterating  towards  a 
software  solution.  This  is  one  of  the  underlying  aspects  of  structured,  high- 
level  languages  and  part  of  the  reason  they  contribute  to  shorter  development 
schedules and increased programmer productivity. An assembly language-based
application can be scaled with respect to programming language through the choice
of  a  high-level  language  to  work  out  the  basic  logic  of  the  application  in  a 
structured  manner.  After  design  validation,  the  chore  of  language  conversion  to 
assembler  for  code  optimization  is  relatively  straightforward.  This  is 
analogous  to  arguments  presented  in  favor  of  simulation  languages,  which  have 
been  used  quite  successfully  in  many  different  instances.  Such  a  methodology 
would  be  ideal  for  the  development  of  software  intended  for  embedded 
applications  and  is,  in  fact,  a  common  practice  in  the  development  of  such 
software as avionics and hand-held devices such as programmable calculators and
language  translators. 

Report 1.1 cited other instances for scaling language selection and
implementation. Scaling language implementation is accomplished by successively
enhancing a base-line subset of the language being implemented, an iterative
enhancement technique which is similar to the scaled approach.

Even though scaling programming language is feasible, the factoring of
this aspect is difficult; much research, however, has been devoted to
quantifying the relative expressive powers of languages. Perhaps the best known work
of this type is Halstead's Software Science. Through the basic tools
of software science, Halstead was able to develop a methodology for factoring
the expressive power of languages on a scale. Further discussion of Halstead's
work here would be a digression, the point being that programming languages have
been analyzed and assigned ratings as to their relative "power". In the
context of scaled systems, such a rating could be used to imply a measured impact
on development effort of typical applications. Additional data is available
quantifying the expressive power of languages at the machine level, this being
the expansion ratio of machine instructions to high-level language statements.
Halstead, Knuth, and others have made contributions in this area.

8.  Hardware  Configuration 

As in the case of data base, factoring hardware configuration is
simplified by the numerically descriptive nature of hardware. Hardware is
basically described by its capacity, transfer rate, quantity, and cost, where
the basic scale factor definition would be:

\[ \text{Scale factor (in \%)} = \frac{\text{value (metric) for scaled version}}{\text{value (metric) for full-scale version}} \times 100 \]
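
As a minimal sketch of the basic scale factor definition, the following Python fragment computes the percentage for a few hardware metrics. The function name and both configurations are hypothetical values chosen only for illustration; they do not come from the report.

```python
def scale_factor(scaled_value: float, full_scale_value: float) -> float:
    """Return the scale factor, in percent, of a scaled metric value
    relative to the corresponding full-scale metric value."""
    if full_scale_value <= 0:
        raise ValueError("full_scale_value must be positive")
    return 100.0 * scaled_value / full_scale_value


# Hypothetical hardware configurations (assumed numbers, for illustration only).
full_scale_config = {"cpus": 4, "peripherals": 20, "instruction_set_size": 200}
scaled_config = {"cpus": 1, "peripherals": 5, "instruction_set_size": 150}

factors = {
    name: scale_factor(scaled_config[name], full_scale_config[name])
    for name in full_scale_config
}
```

Here `factors` maps each metric name to its percentage, so a multiprocessing system of four CPUs scaled to a single processor yields a processing scale factor of 25 percent.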

Consider  the  following  list  of  hardware  elements  possible  for  scaling  and 
their  metrics: 

a. Number of CPUs

Scale from multiprocessing to a single processor.

\[ \text{Processing scale factor} = \frac{\text{number of CPUs in scaled version}}{\text{number of CPUs in full-scale version}} \]

b. Number of Peripherals

\[ \text{Peripheral scale factor} = \frac{\text{number of peripherals in scaled version}}{\text{number of peripherals in full-scale version}} \]

c. Instruction set of a CPU

\[ \text{Instruction set scale factor} = \frac{\text{number of elements in scaled instruction set}}{\text{number of elements in full-scale instruction set}} \]

It must be noted, however, that some devices serve to reduce system
complexity by their presence. Examples would include intelligent terminals and
peripheral controllers or I/O processors.

d. For some factors, simulation could be used to reduce complexity in the
scaled system. For example, eliminate real-time interrupts by eliminating the
input devices (simulate the data instead).

e.  Regarding  communications  between  the  processor  and  peripherals,  the 
number  of  communications  nodes  could  be  easily  factored  by  the  standard 
definition. 

\[ \text{Communication node scale factor} = \frac{\text{number of communication nodes in scaled version}}{\text{number of communication nodes in full-scale version}} \]

f.  Complexity  of  communications  network  or  hierarchy 

The  complexity  can  be  scaled  by  lowering  the  number  of  linkages 
among  nodes. 

g.  Level  of  service  to  peripherals 

A scaled system could provide an equal level of service to each
node rather than prioritizing service. The scaling would be based on the
complexity associated with prioritized service.

Parameter                                   Metric

1. Data Base
   a. Complexity of access method           cost metric
   b. Complexity of data structure
      (1) relative                          links/node
      (2) absolute                          for relational, R+T (R = number of rows,
                                            T = number of tables); number of levels,
                                            number of nodes
   c. Size elements                         number of files, length of files (bytes),
                                            length of records, number of fields,
                                            length of data fields (bytes)

2. Performance
   a. Productivity/throughput
      (1) System capacity                   total number of bytes in system
      (2) System power                      number of bytes/time
      (3) Hardware capacity                 sum of capacities of individual components
      (4) Software capacity                 number of bytes in tables, etc.; number of
                                            error conditions to be checked in the system
   b. Interactive responsiveness            number of responses/unit time
   c. Utilization                           capacity used/capacity available,
                                            time used/time available
   d. O.S./Organization
      (1) Processing mode                   number of modes of operation
      (2) Operating system                  cost and schedule changes

3. Functionality
   a. Modularity                            absolute complexity = number of modules;
                                            relative complexity = number of module
                                            linkages
   b. Vertical subsystem scaling            number of lines of code
   c. Horizontal functional scaling         LOC, staff estimates, number of pages
                                            of functional description

4. Security
   a. File protection                       levels of file protection {1-6}
   b. Dimensions of access matrix           number of users with access restrictions/
                                            total number of users; number of procedures
                                            with access restrictions/total number of
                                            procedures
   c. Number of data sets
   d. Classification level of               LOC for a classification subsystem
      users and/or terminals
   e. Granularity of data access control    Minimize for each level of access
   f. Codewords                             0 = no codewords, 1 = codewords
   g. Audit trail                           0 = no audit trail, 1 = audit trail
   h. Authentication                        LOC
   i. Encryption                            0 = no encryption, 1 = encryption

5. Maintainability
   a. Process-error handling                number of conditions to be checked
   b. Restart/recovery                      LOC
   c. Data correction                       LOC
   d. Fault detection                       LOC
   e. Monitors                              LOC
   f. Backup
   g. Development aids
   h. Documentation                         number of pages

6. Reliability
   a. Precision                             number of decimal places or bits
   b. Data error detection                  LOC for error detection
   c. Approximation algorithms              LOC required to implement algorithms,
                                            absolute and relative logical complexity
   d. Coding standard enforcement

7. Programming Language                     "power"

8. Hardware Configuration
   a. Number and complexity of hardware     number of CPUs, number of peripherals,
                                            number of elements in instruction set
   b. Interrupts
   c. Complexity of communications network  number of communications nodes,
                                            number of linkages among nodes
   d. Level of service to peripherals       complexity of prioritized service

INTERRELATIONSHIPS AMONG SCALING FACTORS


The scale factors proposed in Report 1.2, System Scale Factor
Metrics, influence and interrelate with each other in complex ways that
can be quite different for different operating regimes (e.g.,
disk-limited, CPU-limited) of the IDHS being modeled. In order to scale
a system, ways are needed of predicting changes in system characteristics
as the scaled parameters vary, even when the variations are large enough
to place the IDHS into a different operating region. For example, if the
small scale system has a factor of four fewer terminals than the
envisioned full-scale system, it is necessary for the system designer to
know how system throughput will degrade when the system is scaled up and
terminals are added.

In many engineering applications, the amount by which critical
parameters vary is in some sense "small", and it is possible to represent
the relationships as linearized expansions about some nominal operating
point. Unfortunately, the kind of scaling that is appropriate in the
present application is generally characterized by variations ranging from
a factor of 2 to 10. IDHS, when scaled by these magnitudes, will often
be operating in entirely different regimes, and no simple expressions
relating the performance characteristics in different regimes can
generally be constructed.

As an illustration, consider the functional relationship of system
throughput to a scaled parameter such as CPU power for an IDHS operating
in a disk-limited or a CPU-limited operating regime.

Consider Q, the ratio of the length of time the CPU is occupied to
the length of time the disk is occupied:

\[ Q = \frac{\dfrac{\text{CPU instructions}}{\text{disk access}} \times \dfrac{\text{seconds}}{\text{CPU instruction}}}{\dfrac{\text{seconds}}{\text{disk access}}} = \frac{\text{seconds CPU occupied}}{\text{seconds disk occupied}} \]

where CPU instructions/second is the CPU power.
When  Q  is  less  than  one,  the  system  is  disk-limited.  Figure  1 
demonstrates  how  such  a  disk-limited  system  might  look  with  two  jobs 
running,  with  control  of  the  CPU  and  disk  alternating  over  time.  The 
jobs generally finish using the CPU quickly and must wait for the slower
disk.


When the system is disk-limited it tends to be rather insensitive to
CPU speed, but throughput varies greatly with changes in disk access time
and with those software changes, e.g., in data base organization, that
vary the CPU instructions executed per disk access. That is, the
behavior of a disk-limited system (most jobs are in the disk queue) is
sensitive to:

o Disk hardware characteristics (e.g., speed, size)
o Data base organization affecting disk accesses per search
o Security features that require disk accesses
o Available main memory where this influences paging and/or swapping rates

The behavior is insensitive to:

o CPU power
o Software changes that affect the number of computational instructions
o Data base organization that doesn't affect disk accesses (e.g., file size
  in a random access configuration)

TIME    JOB 1 STATUS                   JOB 2 STATUS

  |     Job 1 gets CPU                 Job 2 in CPU queue
  |     Job 1 gets disk                Job 2 gets CPU
  |     Job 1 continues using disk     Job 2 in disk queue
  |     Job 1 gets CPU                 Job 2 gets disk
  v     Job 1 in disk queue            Job 2 continues using disk

Figure 1. Job Behavior in a Disk-limited System

On the other hand, if the quantity Q is larger than one, as
illustrated in Figure 2, most jobs end up waiting for CPU services and
the system is more sensitive to changes in CPU power and those parameters
that affect the number of instructions.
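
The two regimes can be told apart by evaluating Q directly. The sketch below is a hedged illustration: the function names, the workload figures, and the "balanced" label for Q exactly equal to one are assumptions, not part of the report.

```python
def occupancy_ratio(instructions_per_disk_access: float,
                    cpu_instructions_per_second: float,
                    seconds_per_disk_access: float) -> float:
    """Q = seconds the CPU is occupied / seconds the disk is occupied,
    both taken per disk access."""
    cpu_seconds_per_access = instructions_per_disk_access / cpu_instructions_per_second
    return cpu_seconds_per_access / seconds_per_disk_access


def regime(q: float) -> str:
    """Classify the operating regime from Q."""
    if q < 1.0:
        return "disk-limited"
    if q > 1.0:
        return "CPU-limited"
    return "balanced"  # label assumed here; the text only covers Q < 1 and Q > 1


# Hypothetical workload: 20,000 instructions per disk access, 40 ms per access.
q_fast_cpu = occupancy_ratio(20_000, 1_000_000, 0.04)  # 1 MIPS processor
q_slow_cpu = occupancy_ratio(20_000, 250_000, 0.04)    # 0.25 MIPS processor
```

With these assumed figures the same workload is disk-limited on the faster processor and CPU-limited on the slower one, which is the kind of regime change that makes linear extrapolation unreliable.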

The behavior of a CPU-limited system (most jobs are in the CPU
queue) is sensitive to:

o Changes in CPU power
o Software changes that affect the number of computational instructions

The behavior is insensitive to:

o Speed, size of disk hardware
o Overhead features, such as security, that require extra disk accesses to
  perform specific functions
o Data base organization affecting disk accesses/search
o Available main memory

The  behavior  described  in  these  examples  of  CPU-limited  and 
disk-limited  systems  is  summarized  in  Figure  3,  which  illustrates  the 
throughput  and  CPU  speed  functional  relationship.  It  shows,  for  example, 
that  doubling  CPU  power  does  not  necessarily  double  throughput. 

The relationships between parameters are complex and non-linear. It
is not possible to write down analytic expressions that will hold under
all conditions.

In order to provide the system designer with the tools that will
enable him to predict performance under the wide range of scaling
conditions that are encountered in practical situations, a concept has
been evolved that uses a simulation model of a generalized intelligence
data handling system to predict performance and to predict changes in one
variable from changes in another. In effect then, the simulation
substitutes for the precise analytic functional relationships between
various scaled parameters, which do not exist.

Figure 2. Job Behavior in a CPU-limited System


It has been found that a fairly small number of parameters is
adequate to specify each particular IDHS to the simulation. Each of the
input  parameters,  in  turn,  can  be  expressed  as  a  fairly  simple  analytic 
function  of  the  scaling  parameter  factors.  A  series  of  formulae  are  used 
in  steps  to  relate  the  simulator  variables  to  scale  factors.  A  diagram 
of  the  technique  is  shown  in  Figure  4.  An  example  of  a  simulator  input 
variable  is  CPU  service  time,  i.e.,  time  in  CPU  per  CPU  block,  where  a 
block  is  a  set  of  instructions  until  a  disk  access  is  encountered.  The 
following  formulae  are  one  set  that  can  be  used  to  relate  CPU  service 
time  to  system  scale  factors. 

\[ \text{CPU service time} = \frac{\text{instructions executed per block}}{\text{power (instructions per time)}} \]

\[ \text{Total instructions executed} = \text{number of computational instructions} + \text{number of disk accesses} \times \frac{\text{security instructions}}{\text{access}} + \text{other overhead instructions} \]

\[ \text{Number of disk accesses} = \text{number of data base accesses} + \text{number of paging accesses} \]

\[ \text{Number of paging accesses} = K_1 \times \text{number of instructions executed} \times \frac{\text{virtual core per job}}{\text{real core per job}} \]

\[ \text{Real core per job} = \frac{\text{hardware core} - \text{operating system core} - \text{security core} - \text{maintenance core}}{\text{number of terminals}} \]

The above step-by-step procedure to relate simulator variables to
scaling parameters, such as number of terminals and security and
maintenance core (as functions of the levels of protection and
maintainability required), is illustrated in Figure 5 for CPU and disk
service time.

Figure 4. Relating Scale Factors to Simulation Variables (diagram; one block is labeled "Basic Scaled Variables")

Other parameters of which throughput is a function include disk
service time, job rate (number of jobs input per time), number of CPU
requests per job, system power, and system capacity.

Several  of  the  elements  in  these  formulae,  e.g.,  paging  accesses, 
can  be  measured  by  the  system,  and  the  constants  such  as  K1  can  be 
derived  through  system  measurements. 

Unknown  parameters,  e.g.,  the  number  of  computational  instructions 
for  a  typical  job,  must  be  evaluated  in  order  to  complete  the  formulae. 
A  way  to  approach  the  problem  of  evaluation  might  be  to  start  with 
"reasonable"  estimates  for  these  parameters.  When  the  small  scale  system 
is  operational,  they  can  be  measured  by  monitoring  system  behavior. 
Indeed,  the  purpose  of  building  the  small  scale  system  is  to  measure  the 
parameters  which  will  be  used  in  the  full-scale  system  so  that 
flexibility  in  the  design  of  the  full-scale  system  can  be  retained.  The 
small  scale  system  together  with  the  simulation  will  enable  the  designer 
to  see  what  will  work  in  the  full-scale  system. 

The simulation will be used by the system designer in an iterative
manner in the course of specifying the full-scale system. The scaling
factors will be specified and used as input to the simulation, the output
will be examined, and scaling will be respecified until the desired
outputs, i.e., full-scale system behavior, are achieved. A typical
question would be: How much can the data base size be scaled up with
present disk hardware without going below the minimum required
responsiveness (responses per unit time)? Will it be necessary to have
more and/or faster disks in order to achieve the desired full-scale
system responsiveness and incorporate the necessary data base size? If
access time is improved by so much, how much can the data base then be
scaled up? The system designer will look at the results of the
simulation based on a set of values for the scaling parameters and
iteratively adjust these values. Such respecifications of scaling may
well result in design changes for the full-scale system, e.g., by going to
more and/or more powerful hardware. Thus the tools to be used will be
the simulator and the set of input variables.

The remaining research on this task will involve further definition
of the method's details and ensuring that the simulation has sufficient
realism for the case of IDHS. In addition, it must be verified that the
functional relationships between the scale factor parameters, as measured
by the defined metrics, and the simulation input variables are valid
relations. If necessary, metrics will be redefined.

Simulator Variable - Scale Factor Equations

The report Interrelationships Among Scaling Factors described a
procedure to relate simulator variables to scaling parameters. The
definitions and equations have been refined and are described here.

Consider the simulation parameter CMEAN, mean CPU service time.
It can be defined as follows:

\[ \text{CMEAN} = \frac{\text{instructions executed/block}}{\text{instructions/time (power)}} \]

Consider also the following definitions:

\[ N_D = \text{number of disk accesses} = N_{DB} \ (\text{number of data base disk accesses}) + N_{DP} \ (\text{number of paging disk accesses}) \]

\[ I = \text{number of instructions} = I_C \ (\text{number of computational instructions}) + I_{DB} \ (\text{number of data base instructions}) + I_P \ (\text{number of paging instructions}) \]

Define the frequency of data base disk accesses per computational
instruction,

\[ F_{DB} = \frac{N_{DB}}{I_C} \]

Then

\[ N_{DB} = I_C \left( \frac{N_{DB}}{I_C} \right) = I_C F_{DB}. \]

Also,

\[ N_{DP} = K_P \, I_C \, \frac{C_V}{C_R}, \]

where $K_P$ is a system-dependent constant calculated as the number of
paging accesses/computational instruction, $C_V$ is the virtual core for a
particular job (the job size), and $C_R$ is the real core for the job (the
actual core available for the job).

Then

\[ N_D = N_{DB} + N_{DP} \]

\[ N_D = I_C \left[ F_{DB} + K_P \frac{C_V}{C_R} \right] \]

\[ \frac{I_C}{N_D} = \frac{1}{F_{DB} + K_P \dfrac{C_V}{C_R}} \]

$I_C / N_D$ is the number of instructions per block, so

\[ \text{CMEAN} = \frac{\left[ F_{DB} + K_P \dfrac{C_V}{C_R} \right]^{-1}}{\text{instructions/time}} \]
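
A short numerical sketch of the CMEAN expression follows. The function name and all parameter values are assumed for illustration only; real values of F_DB and K_P would have to be estimated or measured as the text goes on to describe.

```python
def cmean(f_db: float, k_p: float, c_v: float, c_r: float, power: float) -> float:
    """Mean CPU service time per block.

    f_db  : data base disk accesses per computational instruction
    k_p   : paging accesses per computational instruction (system constant)
    c_v   : virtual core for the job
    c_r   : real core available to the job
    power : instructions executed per unit time
    """
    disk_accesses_per_instruction = f_db + k_p * c_v / c_r
    instructions_per_block = 1.0 / disk_accesses_per_instruction
    return instructions_per_block / power


# Hypothetical inputs: one data base access per 1000 instructions, a job
# twice the size of its real core, and a 1 MIPS processor.
base = cmean(f_db=0.001, k_p=0.0005, c_v=40, c_r=20, power=1_000_000)
faster_cpu = cmean(f_db=0.001, k_p=0.0005, c_v=40, c_r=20, power=2_000_000)
more_core = cmean(f_db=0.001, k_p=0.0005, c_v=40, c_r=40, power=1_000_000)
```

Under these assumptions, doubling the power halves CMEAN, while doubling the real core lengthens the block (fewer paging accesses per instruction) and so raises CMEAN.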

To find a value for $F_{DB}$, estimates and typical numbers will be sought.
The value will depend on the function being performed and the probability
of having to make a disk access. There are several factors that affect
the probability that a piece of information is in core vs. on disk, such as
the amount of the data base that is stored in core at any time, the
organization of data on the disk (the data base structure), and the data
manipulation algorithms. Also, since $F_{DB}$ was defined as $N_{DB}/I_C$, it
may be possible to calculate $N_{DB}$ for a given function and data base
organization, while $I_C$ would also be a function of scale factors, such as
the function being performed and the data base size and data base
complexity. Thus $F_{DB}$ could be derived in this way.

To find $C_V$, for each job of type $j$, assume input values for the simulation
parameter mean $C_V(j)$, $\bar{C}_V(j)$. As the job begins, pick the actual $C_V$
according to a probability distribution function.

For $C_R$, the real available core for the job, the following
system-dependent values can be input:

$C_T$ = total core for the machine

$C_{OS}$ = operating system core

Then

\[ C_R = \frac{C_T - C_{OS}}{\text{Number of jobs running}} = \frac{C_T - C_{OS}}{\text{Number of terminals}} \]

Now consider the simulation parameter DMEAN, the disk service time.

    DMEAN = seek time + disk read speed * average amount read.

The seek time is a function of hardware, a scale factor, and usually
dominates DMEAN. Whether the other element, disk read speed * average amount
read, is negligible or not depends on how the system is handled.
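
The following Python sketch evaluates DMEAN and the share of it contributed by the seek. It assumes that "disk read speed" is expressed as time per unit of data, so that its product with the amount read is a time; the function name and the numbers are hypothetical.

```python
def dmean(seek_time: float, time_per_unit_read: float, average_amount_read: float) -> float:
    """Mean disk service time: seek time plus transfer time."""
    return seek_time + time_per_unit_read * average_amount_read


# Hypothetical device: 30 time units per seek, 0.002 time units per word, 512 words read.
example_dmean = dmean(seek_time=30.0, time_per_unit_read=0.002, average_amount_read=512.0)
seek_share = 30.0 / example_dmean
print(example_dmean, seek_share)
```

In this example the seek accounts for roughly 97% of DMEAN, which is the situation the text describes as typical; a much larger average amount read would make the transfer term significant.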

Another simulation parameter, IMEAN, is defined as follows:

\[ \mathrm{IMEAN} = \text{CPU/disk iteration count} = \text{number of disk accesses} = N_D = N_{DB} + N_{DP} \]

Then

\[ \mathrm{IMEAN} = I_C \left[ F_{DB} + K_P \frac{C_V}{C_R} \right] \]

If F_DB is difficult to calculate, the following equation can be
used instead:

\[ \mathrm{IMEAN} = N_{DB} + I_C K_P \frac{C_V}{C_R} \]
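
The two forms of IMEAN agree whenever F_DB = N_DB / I_C, as the short Python check below illustrates. The function names and the input values are hypothetical; the second term in each form is read as I_C * K_P * C_V / C_R, the disk accesses caused by paging.

```python
def imean_from_fraction(i_c: float, f_db: float, k_p: float, c_v: float, c_r: float) -> float:
    """IMEAN = I_C * (F_DB + K_P * C_V / C_R)."""
    return i_c * (f_db + k_p * c_v / c_r)


def imean_from_count(n_db: float, i_c: float, k_p: float, c_v: float, c_r: float) -> float:
    """IMEAN = N_DB + I_C * K_P * C_V / C_R, for use when F_DB is hard to estimate."""
    return n_db + i_c * k_p * c_v / c_r


# Hypothetical job: 10,000 instructions and 10 data base accesses, so F_DB = 0.001.
i_c, n_db = 10_000.0, 10.0
first_form = imean_from_fraction(i_c, n_db / i_c, k_p=0.0005, c_v=64.0, c_r=32.0)
second_form = imean_from_count(n_db, i_c, k_p=0.0005, c_v=64.0, c_r=32.0)
print(first_form, second_form)
```

Both forms give about 20 disk accesses for this job: 10 for the data base and 10 for paging.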

In the equations that have been discussed, the simulator parameters
have been defined as functions of many of the scale factors, including
power (instructions/time), number of terminals, real core (system capacity),
number of instructions (related to data base complexity and structure),
hardware, functionality, and security core (involved in the calculation of
C_R, the real available core for a job). The use of the simulator with
experimental values will then permit analysis of scale factor
interrelationships.
Simulator Description

The operating system performance simulator operates as follows: a
job enters the system at random intervals from one of n terminals. The
job is assigned a job class (disk or CPU bound) and a CPU/disk iteration
count based on a probability distribution.

The job is placed on the CPU or disk queue if the required facility
is busy; when it gains control of the CPU, it is assigned a CPU service
time based on a probability distribution function; similarly, a disk service
time is assigned when it gains control of the disk. When the job has been
completely serviced, the terminal that submitted the job waits a period
of time based on a user-submitted probability function until a new job is
submitted from that terminal.

Another method of describing the simulation is by enumeration of its
elements, i.e., its objects (terminals, jobs, CPU, and disk), as shown in
Figure 1, and its events, as shown in Figure 2.

Events can be job creation, job start, CPU event, or disk event. The
description of each is shown in Figure 2.
Objects:

    n terminals
    Jobs
    CPU
    Disk

Characteristics of objects:

    Terminal:
        wait time
        number of terminals
        active job
    CPU:
        queue
        active job
        service time slice
    Disk:
        queue
        active job
        service time slice
    Job:
        status - CPU queue, CPU active, disk queue,
                 disk active, completed
        class
        CPU/disk iteration count

Figure 1. Simulation Characterization of Objects
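
The objects and characteristics listed in Figure 1 map naturally onto a few record types. The Python sketch below is one possible representation; the class and field names are chosen for illustration and are not taken from the original simulator.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
from typing import Deque, Optional


class JobStatus(Enum):
    CPU_QUEUE = "CPU queue"
    CPU_ACTIVE = "CPU active"
    DISK_QUEUE = "disk queue"
    DISK_ACTIVE = "disk active"
    COMPLETED = "completed"


class JobClass(Enum):
    CPU_BOUND = "CPU bound"
    DISK_BOUND = "disk bound"


@dataclass
class Job:
    job_class: JobClass
    iterations_left: int                      # CPU/disk iteration count still to run
    status: JobStatus = JobStatus.CPU_QUEUE   # a started job first joins the CPU queue


@dataclass
class Facility:
    """A CPU or a disk: a queue, the active job, and a mean service time slice."""
    name: str
    mean_time_slice: float
    queue: Deque[Job] = field(default_factory=deque)
    active_job: Optional[Job] = None


@dataclass
class Terminal:
    mean_wait_time: float
    active_job: Optional[Job] = None


# Hypothetical configuration: one CPU, one disk, eight terminals.
cpu = Facility("CPU", mean_time_slice=3.5)
disk = Facility("Disk", mean_time_slice=100.0)
terminals = [Terminal(mean_wait_time=500.0) for _ in range(8)]

job = Job(JobClass.DISK_BOUND, iterations_left=4)
terminals[0].active_job = job
cpu.queue.append(job)
```

The number of terminals is represented here by the length of the `terminals` list rather than by a field on each terminal.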
Job creation event

    1. Creates a job object.
    2. Assigns a job class and iteration count randomly.
    3. Schedules next job creation event based on user creation rate.

Job start event

    1. Activates job by placing it in CPU queue.
    2. Schedules CPU event if CPU queue is empty.

CPU event

    1. If job has CPU, determine if it is finished.
       If finished:
           a. delete job object
           b. remove job from terminal
           c. schedule a job start event
       If not finished:
           a. add job to disk queue
           b. if disk free, schedule disk event
    2. Assign next job in CPU queue (if any) to CPU.
    3. Determine time slice for this CPU slice.
    4. Schedule CPU event for this time.

Disk event

    1. If job has disk:
           a. add job to CPU queue
           b. if CPU free, schedule CPU event
    2. Assign next job in disk queue to disk.
    3. Determine disk time slice for this disk access.
    4. Schedule disk event.

Figure 2. Description of Simulator Events
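
The event descriptions in Figure 2 can be sketched as a small event-driven simulation in Python. This is a minimal illustration under stated assumptions, not the original simulator: job creation and job start are merged into a single "start" event, job classes are omitted, all random quantities are drawn from exponential distributions, and the parameter names (`cmean`, `dmean`, `imean`, `think_mean`, `horizon`) and the example values are hypothetical.

```python
import heapq
import random
from collections import deque


def simulate(n_terminals, cmean, dmean, imean, think_mean, horizon, seed=0):
    """Run a terminal/CPU/disk event simulation until time `horizon`."""
    rng = random.Random(seed)
    events = []     # heap of (time, sequence number, event kind)
    counter = 0     # tie-breaker so equal times pop in insertion order

    def schedule(time, kind):
        nonlocal counter
        heapq.heappush(events, (time, counter, kind))
        counter += 1

    # Each facility has a queue, an active job, and accumulated busy time.
    cpu = {"queue": deque(), "active": None, "busy": 0.0}
    disk = {"queue": deque(), "active": None, "busy": 0.0}
    completed = 0
    total_response = 0.0

    def start_service(facility, mean_slice, kind, now):
        """If the facility is idle, assign the next queued job and schedule its event."""
        if facility["active"] is None and facility["queue"]:
            facility["active"] = facility["queue"].popleft()
            time_slice = rng.expovariate(1.0 / mean_slice)
            facility["busy"] += min(time_slice, horizon - now)  # clip at the horizon
            schedule(now + time_slice, kind)

    # Every terminal submits its first job after an initial wait.
    for _ in range(n_terminals):
        schedule(rng.expovariate(1.0 / think_mean), "start")

    while events:
        now, _, kind = heapq.heappop(events)
        if now > horizon:
            break
        if kind == "start":
            # Draw an iteration count and place the new job in the CPU queue.
            iterations = max(1, round(rng.expovariate(1.0 / imean)))
            cpu["queue"].append({"left": iterations, "submitted": now})
        elif kind == "cpu":
            job, cpu["active"] = cpu["active"], None
            if job["left"] == 0:
                # Finished: record response time; the terminal waits, then resubmits.
                completed += 1
                total_response += now - job["submitted"]
                schedule(now + rng.expovariate(1.0 / think_mean), "start")
            else:
                job["left"] -= 1
                disk["queue"].append(job)
        else:  # kind == "disk": the disk access is done, return the job to the CPU queue
            job, disk["active"] = disk["active"], None
            cpu["queue"].append(job)
        start_service(cpu, cmean, "cpu", now)
        start_service(disk, dmean, "disk", now)

    return {
        "completed": completed,
        "mean_response": total_response / completed if completed else float("nan"),
        "cpu_utilization": cpu["busy"] / horizon,
        "disk_utilization": disk["busy"] / horizon,
    }


result = simulate(n_terminals=8, cmean=3.5, dmean=100.0, imean=4.0,
                  think_mean=500.0, horizon=200_000.0, seed=1)
print(result)
```

A heap ordered by event time plays the role of the event list, and the two calls to `start_service` after every event correspond to steps 2 through 4 of the CPU and disk events. Varying `n_terminals` or `cmean` between runs is the kind of experiment the report describes.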
The operating system performance simulator has been exercised with
various sets of test data for two purposes: first, to examine how scale
factors interrelate in a given environment, and second, to demonstrate
how the simulator would be used in an actual implementation situation.

As  was  pointed  out  in  the  report  entitled  "Interrelationships  Among 
Scaling  Factors",  the  relationships  between  system  parameters,  i.e., 
scale  factors,  are  complex  and  non-linear.  It  is  not  possible  to  derive 
analytic  expressions  that  will  hold  under  all  conditions.  The  simulation 
model  of  a  generalized  intelligence  data  handling  system  can  be  used  to 
predict  performance  and  to  predict  changes  in  one  variable  from  changes 
in another. The simulation thus stands in for the precise analytic
functional relationships between the various scaled parameters, which
do not exist.

The test data set was designed to enable system performance to be
evaluated for different combinations of parameter values that permit
comparative analysis of system scale factor interrelationships. Some of
the issues addressed include how the number of terminals, the mix (the
combination of CPU-bound and disk-bound jobs), and average CPU service
time affect disk waiting time, CPU and disk utilization, response time,
and other measures of system performance.

The simulator selects CPU and disk service times and terminal wait
times using Poisson distributions. This distribution models arrivals
satisfactorily.
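
How the original simulator drew its random values is not specified beyond the statement above, so the Python sketch below shows two common readings. The first draws an integer count from a Poisson distribution using Knuth's multiplication method; the second draws a continuous duration from an exponential distribution, which is the distribution of the gaps between arrivals in a Poisson arrival process. The function names and parameter values are hypothetical.

```python
import math
import random


def poisson_variate(rng: random.Random, mean: float) -> int:
    """Draw an integer from a Poisson distribution (Knuth's method, suitable for modest means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def exponential_duration(rng: random.Random, mean: float) -> float:
    """Draw a duration whose distribution matches the gaps in a Poisson arrival process."""
    return rng.expovariate(1.0 / mean)


rng = random.Random(1)
counts = [poisson_variate(rng, mean=4.0) for _ in range(20_000)]
durations = [exponential_duration(rng, mean=100.0) for _ in range(20_000)]
count_mean = sum(counts) / len(counts)
duration_mean = sum(durations) / len(durations)
print(count_mean, duration_mean)
```

Both sample means should fall close to the requested means (4 and 100 here), which is a quick check that the generators are behaving as intended before they are used to drive a simulation.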

The value of examining different values for a parameter such as CPU
service time is that it is a way of simulating added overhead that
features such as security operations may require. Extra disk accesses
may also be required to perform specific security functions, so
individual implementations of security systems will affect system
performance in varying ways, as a function of these application-dependent
system parameters (CPU time and disk accesses).

The test runs indicate that the number of terminals appears to be a
prime factor in system performance, while adding CPU overhead, such as
security features, does not change response time significantly. As shown
in Figure 1, plotting the number of terminals against the average disk
wait demonstrates that the job mix and number of CPU/disk iterations play
very little role in the resultant average disk wait; regardless of
whether the system is CPU- or disk-bound, the average disk wait increases
almost proportionately with the number of terminals, e.g., the disk wait
with 16 terminals is approximately twice the disk wait with 8 terminals.

As shown in Table 1, in a system of 8 terminals, with 50% of the
jobs CPU-bound, the average disk wait is 235 time units. When 67% of the
jobs are CPU-bound, the average disk wait is 217 time units. Even when
the system is made more heavily CPU-bound, with 67% of the jobs in this
category, and an average CPU time slice approximately half of the disk
time slice, the average disk wait for 8 terminals is 207, compared with
236 for a 67% CPU-bound system with CPU time slices only 3.5% of the disk
time slice. No dramatic changes in disk wait time have taken place from
changing the job mix values. Similarly, doubling the average number of
CPU/disk iterations does not have much impact on the average disk wait,
as demonstrated in Table 2. However, the response time doubles as the
number of CPU/disk iterations doubles, a consideration for those overhead
operations requiring disk accesses.
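
The proportionality claims can be checked directly against the tabulated values. The Python sketch below uses the average disk waits of Table 1 and the 50% job mix response times of Table 2; treating the block headed "Base CPU/Disk Iterations" as the 50% job mix is an assumption made here, and the variable names are illustrative.

```python
# Average disk wait by number of terminals, from Table 1.
disk_wait = {
    "50% job mix": {8: 235, 10: 305, 16: 477, 20: 621},
    "67% job mix": {8: 217, 10: 279, 16: 458, 20: 618},
}
# Average response time by number of terminals, from Table 2
# (50% job mix assumed for both blocks).
response_base = {8: 1128, 10: 1457, 16: 2040, 20: 2788}
response_doubled = {8: 2323, 10: 2763, 16: 4009, 20: 4930}

# Ratio of the disk wait at 16 terminals to the disk wait at 8 terminals.
wait_ratio_16_to_8 = {mix: waits[16] / waits[8] for mix, waits in disk_wait.items()}
# Disk wait per terminal; a nearly constant value indicates proportional growth.
wait_per_terminal = {
    mix: {n: wait / n for n, wait in waits.items()} for mix, waits in disk_wait.items()
}
# Ratio of response time with doubled iterations to response time with base iterations.
response_ratio = {n: response_doubled[n] / response_base[n] for n in response_base}

for mix, ratio in wait_ratio_16_to_8.items():
    print(f"{mix}: disk wait at 16 terminals / 8 terminals = {ratio:.2f}")
for n, ratio in response_ratio.items():
    print(f"{n} terminals: doubled / base response time = {ratio:.2f}")
```

The disk wait ratios come out at about 2.03 and 2.11, and the disk wait per terminal stays between roughly 27 and 31 time units, consistent with nearly proportional growth. The response time ratios lie between about 1.77 and 2.06.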
[Figure 1: plot of average disk wait against number of terminals, with
curves for 50% CPU-bound jobs and 67% CPU-bound jobs]


                 50% Job Mix                  67% Job Mix

No. of       Average      Average         Average      Average
Terminals    Disk Wait    CPU Wait        Disk Wait    CPU Wait

    8           235         1.96             217         1.65
   10           305         1.76             279         1.71
   16           477         2.22             458         2.17
   20           621         1.92             618         1.85

Table 1: Average Disk & CPU Waits

Double CPU/Disk Iterations

                 50% Job Mix                  67% Job Mix

No. of       Average      Average         Average      Average
Terminals    Disk Wait    Response        Disk Wait    Response
                          Time                         Time

    8           240         2323             223         1910
   10           308         2763             297         2301
   16           477         4009             447         3396
   20           609         4930             623         4218

Base CPU/Disk Iterations

                 50% Job Mix                  67% Job Mix

No. of       Average      Average         Average      Average
Terminals    Disk Wait    Response        Disk Wait    Response
                          Time                         Time

    8           226         1128         [illegible]  [illegible]
   10           303         1457         [illegible]  [illegible]
   16           477         2040         [illegible]  [illegible]
   20           632         2788         [illegible]  [illegible]

Table 2: Average Disk Waits & Response Times
         for Base & Doubled CPU/Disk Iterations


Disk utilization is consistently over 99% regardless of the
variations in the values for the parameters. This result is to be
expected because most computer systems will be limited by the nature of
the disk hardware, i.e., its speed.

CPU utilization remains at about 2% to 3% when the average CPU time
slice is approximately 3.5% of the average disk time slice and increases
to 15-25% when the CPU time slice is increased to approximately half that
of the disk time slice. Thus, it is difficult to come anywhere near
loading the CPU.

Figure 2 shows how the response time reacts to changes in the job
mix. As might be expected, response time is lowest when the largest
percentage of jobs is CPU-bound, i.e., the 67% curve. The rate of change
in response time as the number of terminals increases can be seen to be
fairly consistent. The three upper curves plot the response time
resulting when the number of CPU/disk iterations is doubled.

It is valuable, too, to examine what happens when the terminal wait
time or "think time" is approximately doubled. This factor relates to
job rate (the number of jobs input per time). Table 3 summarizes the
results of test runs which indicate that there is very little change in
the response time when the wait time is doubled; sometimes it increases
a bit, sometimes it decreases, and sometimes it does not change. Thus,
it would appear that terminal wait time can be changed within reasonable
limits without significantly affecting response time.
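
The size of the changes can be quantified from the response times reported in Table 3. The Python sketch below assumes that the table lists two runs for each number of terminals and that each row pairs a base wait time result with the corresponding doubled wait time result; the variable names are illustrative.

```python
# (number of terminals, response time at base wait, response time at doubled wait)
table_3 = [
    (8, 911, 856), (8, 1691, 1934),
    (10, 1025, 1067), (10, 2485, 2112),
    (16, 1728, 1966), (16, 3559, 3249),
    (20, 2327, 2241), (20, 4784, 4733),
]

# Relative change in response time when the wait time is doubled.
changes = [(n, (doubled - base) / base) for n, base, doubled in table_3]
largest_change = max(abs(change) for _, change in changes)
mean_change = sum(change for _, change in changes) / len(changes)
increases = sum(1 for _, change in changes if change > 0)

for n, change in changes:
    print(f"{n:>2} terminals: {change:+.1%}")
print(f"largest change {largest_change:.1%}, mean change {mean_change:+.1%}")
```

Under this pairing the individual changes range from about -15% to +14%, three of the eight runs show an increase, and the mean change is within half a percent of zero, so no systematic effect of wait time on response time is visible.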

No. of         Response Time for     Response Time for
Terminals      1 Wait Time           Double Wait Time

    8               911                   856
                   1691                  1934
   10              1025                  1067
                   2485                  2112
   16              1728                  1966
                   3559                  3249
   20              2327                  2241
                   4784                  4733

Table 3: Comparison of Response Times When Wait Time Doubles

It must be kept in mind that the parameter values used in the
simulator for these experiments are just one attempt at approximating a
real system and an example of how the simulator would be used under real
circumstances. That is, conclusions have been reached based on tests
involving, for example, 20 vs. 10 terminals. These conclusions might not
hold when one is considering scaling 100 terminals to 50 terminals.
Recent expansion of the simulator's capabilities has made possible
experiments with a larger number of terminals, up to a maximum of 100,
permitting the examination of interrelationships in that operating
region. The results of these tests will be described in a later report.

The value of the present results, however, is in pinpointing those
scale factor interrelationships that should receive attention when
scaling is required.